THEORETICAL AND COMPUTATIONAL CHEMISTRY
Pauling's Legacy: Modern Modelling of the Chemical Bond
THEORETICAL AND COMPUTATIONAL CHEMISTRY
SERIES EDITORS
Professor P. Politzer
Department of Chemistry, University of New Orleans, New Orleans, LA 70418, U.S.A.

Professor Z.B. Maksić
Rudjer Bošković Institute, P.O. Box 1016, 10001 Zagreb, Croatia
VOLUME 1 Quantitative Treatments of Solute/Solvent Interactions
P. Politzer and J.S. Murray (Editors)
VOLUME 2 Modern Density Functional Theory: A Tool for Chemistry
J.M. Seminario and P. Politzer (Editors)
VOLUME 3 Molecular Electrostatic Potentials: Concepts and Applications
J.S. Murray and K. Sen (Editors)
VOLUME 4 Recent Developments and Applications of Modern Density Functional Theory
J.M. Seminario (Editor)
VOLUME 5 Theoretical Organic Chemistry
C. Párkányi (Editor)
VOLUME 6 Pauling's Legacy: Modern Modelling of the Chemical Bond
Z.B. Maksić and W.J. Orville-Thomas (Editors)
THEORETICAL AND COMPUTATIONAL CHEMISTRY
Pauling's Legacy: Modern Modelling of the Chemical Bond

Edited by

Z.B. Maksić
Rudjer Bošković Institute, P.O. Box 1016, Bijenička 54, 10001 Zagreb, Croatia

W.J. Orville-Thomas
Caer Cae Melyn, Aberystwyth, Dyfed SY23 2HA, Wales, UK
ELSEVIER
1999
Amsterdam - Lausanne - New York - Oxford - Shannon - Singapore - Tokyo
ELSEVIER SCIENCE B.V.
Sara Burgerhartstraat 25
P.O. Box 211, 1000 AE Amsterdam, The Netherlands

© 1999 Elsevier Science B.V. All rights reserved.

This work is protected under copyright by Elsevier Science, and the following terms and conditions apply to its use:

Photocopying
Single photocopies of single chapters may be made for personal use as allowed by national copyright laws. Permission of the publisher and payment of a fee is required for all other photocopying, including multiple or systematic copying, copying for advertising or promotional purposes, resale, and all forms of document delivery. Special rates are available for educational institutions that wish to make photocopies for non-profit educational classroom use.

Permissions may be sought directly from Elsevier Science Rights & Permissions Department, PO Box 800, Oxford OX5 1DX, UK; phone: (+44) 1865 843830, fax: (+44) 1865 853333, e-mail: [email protected]. You may also contact Rights & Permissions directly through Elsevier's home page (http://www.elsevier.nl), selecting first 'Customer Support', then 'General Information', then 'Permissions Query Form'.

In the USA, users may clear permissions and make payments through the Copyright Clearance Center, Inc., 222 Rosewood Drive, Danvers, MA 01923, USA; phone: (978) 7508400, fax: (978) 7504744, and in the UK through the Copyright Licensing Agency Rapid Clearance Service (CLARCS), 90 Tottenham Court Road, London W1P 0LP, UK; phone: (+44) 171 436 5931; fax: (+44) 171 436 3986. Other countries may have a local reprographic rights agency for payments.

Derivative Works
Tables of contents may be reproduced for internal circulation, but permission of Elsevier Science is required for external resale or distribution of such material. Permission of the publisher is required for all other derivative works, including compilations and translations.

Electronic Storage or Usage
Permission of the publisher is required to store or use electronically any material contained in this work, including any chapter or part of a chapter. Contact the publisher at the address indicated.

Except as outlined above, no part of this work may be reproduced, stored in a retrieval system or transmitted in any form or by any means, electronic, mechanical, photocopying, recording or otherwise, without prior written permission of the publisher. Address permissions requests to: Elsevier Science Rights & Permissions Department, at the mail, fax and e-mail addresses noted above.

Notice
No responsibility is assumed by the Publisher for any injury and/or damage to persons or property as a matter of products liability, negligence or otherwise, or from any use or operation of any methods, products, instructions or ideas contained in the material herein. Because of rapid advances in the medical sciences, in particular, independent verification of diagnoses and drug dosages should be made.

First edition 1999

Library of Congress Cataloging in Publication Data
A catalog record from the Library of Congress has been applied for.

ISBN: 0-444-82508-8

The paper used in this publication meets the requirements of ANSI/NISO Z39.48-1992 (Permanence of Paper).

Printed in The Netherlands.
PREFACE
Theory and experiment in chemistry today provide a wealth of data, but such data have no meaning unless they are correctly interpreted by sound and transparent physical models. Linus Pauling was second to none in the modelling of molecular properties, as we discuss later in the prologue to this book. Indeed, many of his models have served chemistry for decades, and that has been his lasting legacy for chemists all over the world. The aim of this book has been to put such simple models into the language of modern quantum chemistry, thus providing a deeper justification for many of Pauling's ideas and concepts. However, it should be stressed that many contributions to this book, written by some of the world's most prominent theoretical chemists, do not merely follow in Pauling's footsteps. Following his example, they made bold leaps forward to overcome the limitations of the old models, thus opening new scientific vistas.

We are grateful for the effort, inspiration, care, and patience the authors have shown in the preparation of their contributions to this book. We trust that this spirit of "Pauling's legacy" will appeal to many chemists, both younger and older, in areas ranging from chemical physics to physical organic chemistry. We thank Mr. B. Kovačević for some technical help.

Z.B. Maksić
W.J. Orville-Thomas
December 1998
TABLE OF CONTENTS
Prologue: The Chemical Bond on the Eve of the 21st Century
Zvonimir B. Maksić and W. J. Orville-Thomas
Chapter 1. Theoretical Treatise on Molecular Structure and Geometry
Jerzy Cioslowski
1. Introduction: The Hierarchy of Models in Chemistry
2. Molecular Wavefunctions
3. Decoupling of Nuclear and Electronic Degrees of Freedom
4. The Relevance of Spectroscopic States
   4.1. Time Dependence
   4.2. Interactions with External Fields
   4.3. Intermolecular Interactions
5. Description of Molecular Phenomena with Spectroscopic and Localized States
6. The Concept of Molecular Geometry
7. The Concept of Molecular Structure
8. Concluding Remarks
Chapter 2. Beyond the Born-Oppenheimer Approximation
D. B. Kinghorn and L. Adamowicz
1. Introduction
2. Equivalent Treatment of Nuclei and Electrons
   2.1. Explicit Separation of the Center-of-mass Motion (Method I)
   2.2. Effective Non-adiabatic Method (Method II)
3. Ground-state Wavefunction
4. Variational Calculations
5. Sample Applications
   5.1. Explicit Separation of the Center-of-mass Motion in Variational Calculations of Electron Affinities of H⁻, D⁻ and T⁻
   5.2. Calculation on HD⁺ with Effective Non-adiabatic Method
6. General N-body Non-adiabatic Wavefunction
7. Summary
Chapter 3. The Mills-Nixon Effect: Fallacies, Facts and Chemical Relevance
Zvonimir B. Maksić, Mirjana Eckert-Maksić, Otilia Mó and Manuel Yáñez
1. Introduction
2. The Mills-Nixon Effect: The first Experimental Result and Theoretical Interpretation by Sutton and Pauling
   2.1. Definition of the MN-effect and some Common Fallacies
3. Structural Consequences of the MN-Effect
   3.1. The Role of Rehybridization
   3.2. The Role of π-Delocalization
   3.3. Paradigmatic Indan and Tetralin Cases
   3.4. The Ring Size Effect
   3.5. The Effect of the Double Bond and Lone Pair(s)
   3.6. Amplification of the Mills-Nixon Effect
   3.7. Extended π-Systems: [N]phenylenes
4. Reversed Mills-Nixon Effect
5. Chemical Consequences of the Mills-Nixon Effect
   5.1. Electrophilic Substitution Reactivity
   5.2. Miscellaneous Physical and Chemical Properties
6. Concluding Remarks
Chapter 4. Predicting Structures of Compounds in the Solid State by the Global Optimization Approach
J.C. Schön and M. Jansen
1. Introduction
2. The Energy Landscape
3. The Lid- and the Threshold-algorithm
4. Structure Prediction at Low Temperatures
   4.1. General Aspects
   4.2. Specific Optimization Algorithms
   4.3. Specific Empirical Potentials
5. Examples
6. Connections to Earlier Studies of the Energy Surface of Complex Systems
Chapter 5. Polarizability and Hyperpolarizability of Atoms and Ions
David M. Bishop
1. Historic Times
2. The Paper
3. Survey of Polarizability and Hyperpolarizability Calculations
   3.1. Static Dipole Polarizabilities (α)
      3.1.1. The He Isoelectronic Series
      3.1.2. The Ne Isoelectronic Series
      3.1.3. The Ar Isoelectronic Series
   3.2. Static Dipole Hyperpolarizabilities
      3.2.1. The H Atom
      3.2.2. The He Isoelectronic Series
      3.2.3. The Ne Isoelectronic Series
      3.2.4. The Ar Isoelectronic Series
   3.3. Dynamic Dipole Polarizabilities and Hyperpolarizabilities
      3.3.1. The H Atom
      3.3.2. The He Isoelectronic Series
      3.3.3. The Ne and Ar Isoelectronic Series
4. Conclusions and Other Aspects
Chapter 6. Molecular Polarizabilities and Magnetizabilities
Pål Dahle, Kenneth Ruud, Trygve Helgaker and Peter R. Taylor
1. Introduction
2. Molecular Properties as Energy Derivatives
3. Molecular Properties in the Diagonal Representation of the Hamiltonian
4. Explicit Expressions for Electric and Magnetic Properties
5. London Orbitals
6. The Calculation of Molecular Magnetizabilities: Comparison with Experiment
7. Pascal's Rule and σ-Ring Currents
8. Aromatic Molecules and π-Bond Currents
9. The Polarizability of Normal- and Cyclo-alkanes
10. The Polarizability of Polyaromatic Hydrocarbons
11. Conclusions
Chapter 7. The Concept of Electronegativity of Atoms in Molecules
Juergen Hinze
1. Introduction
2. Pauling's Definition of Electronegativity
3. Mulliken's Definition of Electronegativity
4. Orbital Electronegativity and Electrical Potential
5. Orbital Electronegativity Values
6. Electronegativity Equalization and Charge Distribution
7. Molecular Properties
   7.1. Bond Lengths
   7.2. Bond Energies
8. Conclusion
Chapter 8. On Hybrid Orbitals in Momentum Space
B. James Clark, Hartmut L. Schmider and Vedene H. Smith, Jr.
1. Introduction
2. Fourier Transforms of Position-space Hybrids
3. Hybrids in Momentum Space
   3.1. Hybrids of the spⁿ-Type
   3.2. Hybrids Involving d-Orbitals
4. Moments of the Hybrid Orbitals
5. Conclusion
Chapter 9. Theory as a Viable Partner for Experiment - The Quest for Trivalent Silylium Ions in Solution
Carl-Henrik Ottosson, Elfi Kraka and Dieter Cremer
1. Introduction
   1.1. Why to Investigate Silylium Ions in Solution?
   1.2. Connection to Pauling's Work and Scope of the Article
2. The NMR/ab initio/IGLO Method
3. The Silylium Ion Problem
   3.1. Properties of Silylium Ions in the Gas Phase
4. Silylium and Carbenium Ions in Solution. Interaction of Solvents and Counterions
   4.1. Definition of a Nearly Free Silylium Ion R₃Si⁺ in Solution
   4.2. Carbenium Ions R₃C⁺ in Solution
5. Solvation of Neutral Silyl Compounds R₃SiX and Silylium Ions R₃Si⁺
   5.1. Neutral Si-compounds in Solution
   5.2. Specific Complexation of R₃Si⁺ by Nucleophilic Solvent Molecules
   5.3. Counterions used in Research on R₃Si⁺ and R₃C⁺ Ions in Solution
6. Structure Determination of Silyl Cations in Solution
   6.1. Structure Determination by the NMR/ab initio/IGLO Method
7. Intramolecular Solvation of Silylium Ions
   7.1. Strong Intramolecular Solvation of Silyl Cations
   7.2. Weak Intramolecular Solvation of Silylium Ions
8. Approaching a nearly Free Silylium Ion in Solution
   8.1. Trialkylsilylium Ions in Aromatic Solvents
   8.2. Silyl Substituted Silylium Ions in Solution
   8.3. Dialkylboryl Substituted Silylium Ions in Solution
9. The Solution of the Problem: First Generation of a Free Silylium Cation in Condensed Phases
Chapter 10. Bond Energies, Enthalpies of Formation, and Homologies: The Energetics of Aliphatic and Alicyclic Hydrocarbons and some of their Derivatives
Suzanne W. Slayden and Joel F. Liebman
1. Tetracoordination, Tetrahedral Geometry and Hybridization
2. The Number of Compounds and the Necessity for Interconnections, Homologies and Homologous Series
3. Homologous Series: The 1-Substituted Alkanes
4. Homologous Series: Cycloalkanes
5. Homologous Series: Saturated Polycyclic Hydrocarbons
6. Tetrahedrane and [1.1.1]Propellane
   6.1. Tetrahedrane
   6.2. [1.1.1]Propellane
Chapter 11. Stabilization and Destabilization Energies of Distorted Amides
Arthur Greenberg and David T. Moore
1. Introduction
   1.1. Chemical Implications of Strained Amides and Lactams
   1.2. Biological Implications
   1.3. Effects of Distortion on Acid/Base Properties
   1.4. Defining Distortion of the Amide Linkage
   1.5. Larger Bridgehead Bicyclic Lactams: Are They Hyperstable?
2. Background
   2.1. Energetics of Distorted Lactams
   2.2. Bonding in Lactams: Is there Still a Role for Resonance?
   2.3. Proton Affinities of Bridgehead Bicyclic Lactams: N vs. O
   2.4. State of Calculational Studies of Distorted Amide Linkages
3. Computational Studies
   3.1. Molecular Mechanics
   3.2. Ab initio Calculations
   3.3. Semi-empirical Results
4. Summary
Chapter 12. Some Chemical and Structural Factors Related to the Metastabilities of Energetic Compounds
Peter Politzer and Jane S. Murray
1. Introduction
2. Impact/Shock Sensitivity and Molecular Structure: Some Background
   2.1. Structure-sensitivity Relationship
   2.2. Some Specific Decomposition Pathways
3. Relationships Between Impact Sensitivities and Molecular Surface Electrostatic Potentials
   3.1. Analysis and Characterization of Surface Potentials
   3.2. Unsaturated C-Nitro Derivatives: Nitroaromatics and Nitroheterocycles
   3.3. Impact Sensitivity and Surface Potential Imbalance
4. Summary
Chapter 13. Valence Bond Theory: A Re-examination of Concepts and Methodology
Roy McWeeny
1. Introduction
2. The Electron-pair Bond: Some Preliminaries
3. Classical VB Theory: Perfect-pairing and Resonance
   3.1. Symmetry Considerations
   3.2. Calculation of the Energy
4. The Rise and Fall of Classical VB Theory
5. Modern VB Theory
   5.1. VB Theory with Orthogonal Orbitals
   5.2. The "Nightmare of the Inner Shells"
   5.3. VB Theory with Non-orthogonal Orbitals
   5.4. Connection with Other Methods
6. Some Illustrative Applications
   6.1. The Water Molecule
   6.2. Methyllithium
   6.3. Lithium Fluoride
   6.4. Benzene and Its Ions
7. Conclusion
Chapter 14. Advances in Many-body Valence-bond Theory
Douglas J. Klein
1. Introductory Survey
2. VB Theory: Bases, Models, & Resonance
3. Many-body Theory
4. Many-body Techniques for VB Models
   4.1. Configuration Interaction
   4.2. Many-body Perturbation Theory
   4.3. Cluster and Moment Methods
   4.4. Spin-waves and Green's Functions
   4.5. Wave-function Cluster Expansion
   4.6. Monte Carlo Computations
   4.7. Renormalization-group Techniques
   4.8. Miscellany
5. Overview and Prospects
Chapter 15. Ab Initio Valence Bond Description of Diatomic Dications
Harold Basch, Pinchas Aped, Shmaryahu Hoz and Moshe Goldberg
1. Introduction
2. He₂²⁺
3. O₂²⁺
4. NF²⁺
5. Summary
Chapter 16. One-electron and Three-electron Chemical Bonding, and Increased-Valence Structures
Richard D. Harcourt
1. Introduction
2. The One-electron Bond
3. The One-electron Bond and Non-paired Spatial Orbital Structures
4. A Theorem
5. The Three-electron Bond, or Three-electron Half-bond
   5.1. Paramagnetic Electron Rich Molecules and Molecular Ions that Involve Atoms of Main-group Elements
   5.2. Hypoligated Transition Metal Complexes, such as High-spin (S=2) [Fe(H₂O)₆]²⁺
   5.3. F⁺-type Colour Centers
   5.4. n-Type Semiconductors
   5.5. Conduction in Alkali Metals in the Solid State
6. Instability of Three-electron Bonds
7. The Three-electron Bond with Four or More AOs
8. Three-electron Bonds and Increased-valence Structures for Four-electron Three-centre Bonding
9. Increased-valence Structures and Mulliken-Donor-acceptor Complexes
10. Increased-valence Structures and SN2 Reactions
11. Three-Electron Bonds and Five-electron Three-centre Bonding
12. Three-Electron Bonds and Increased-valence Structures for Extended Six-electron Four-centre Bonding
13. Three-electron Bonds and Increased-valence Structures for Cyclic Six-electron Four-centre Bonding
14. Three-electron Bonds and Covalent-ionic Resonance
15. Conclusions
Chapter 17. Valence Bond Description of π-Electron Systems ..... 481
Joseph Paldus and X. Li
1. Introduction ..... 481
2. PPP-type Hamiltonians ..... 483
3. PPP-VB Model ..... 486
4. Applications ..... 488
4.1. Correlated Ground States, Basic Transferability and The Role of Ionic Structures ..... 488
4.2. Spin Properties ..... 490
4.3. Electron Delocalization, Resonance and Bond Length Alternation ..... 492
4.4. Excited States ..... 493
4.5. VB Corrected Coupled Cluster Method ..... 493
4.6. Ionization Potentials and Electron Affinities ..... 494
5. Conclusions ..... 495
Chapter 18. The Spin-coupled Description of Aromatic, Antiaromatic and Nonaromatic Systems ..... 503
David L. Cooper, Joseph Gerratt and Mario Raimondi
1. Introduction ..... 503
2. Spin-coupled Wavefunctions ..... 505
3. Benzene ..... 507
4. Cyclobutadiene ..... 511
5. Cyclooctatetraene ..... 514
6. Conclusions ..... 514
Chapter 19. Aromaticity and Its Chemical Manifestations ..... 519
Kenneth B. Wiberg
1. Historical Prelude ..... 519
2. Valence Bond vs. Molecular Orbital Theory ..... 521
3. Manifestations of "Aromatic" Stabilization ..... 523
4. Sigma Contribution to the Geometry of Benzene ..... 527
5. Magnetic Properties ..... 529
6. Origin of the Stabilization of Benzene ..... 532
7. Heterocyclic Aromatic Systems ..... 532
8. Summary ..... 533
Chapter 20. Hypercoordinate Bonding to Main Group Elements: The Spin-coupled Point of View ..... 537
David L. Cooper, Joseph Gerratt and Mario Raimondi
1. Introduction ..... 537
2. d-Orbital Participation Versus Democracy ..... 538
3. Hypercoordinate Bonding to First-row Atoms ..... 543
3.1. 1,3-Dipoles ..... 543
3.2. Oxohalides of Hypercoordinate Nitrogen and Phosphorus ..... 547
4. Further Examples ..... 548
4.1. Oxofluorides of Hypercoordinate Sulfur ..... 548
4.2. Chlorine Fluorides and Chlorine Oxide Fluorides ..... 550
4.3. Fluorophosphoranes ..... 550
4.4. YXXY Dihalides and Dihydrides of Dioxygen and Disulfur ..... 551
5. Conclusions ..... 551
Chapter 21. The Electronic Structure of Transition Metal Compounds ..... 555
Gernot Frenking, C. Boehme and U. Pidun
1. Introduction ..... 555
2. Computational Details ..... 556
3. Results and Discussion ..... 558
3.1. Chemical Bonding in [(CO)5W-AlCl(NH3)2] and [(CO)5W-AlCl] ..... 558
3.2. The Series [(CO)5W-XCl(NH3)2] (X = B, Al, Ga, In, Tl) ..... 562
3.3. The Series [(CO)5W-Y] (Y = [SiCl2(NH3)], [AlCl(NH3)2], [Mg(NH3)3], [Na(NH3)3]) ..... 565
4. Summary and Conclusions ..... 568
Chapter 22. Fundamental Features of Hydrogen Bonds ..... 571
Steve Scheiner
1. Introduction ..... 571
2. Hydrogen Bonding Strength ..... 572
3. Contribution of Electrostatics ..... 574
4. Relations Between Various Properties ..... 578
5. Cooperativity ..... 581
5.1. Geometries ..... 582
5.2. Energetics ..... 584
5.3. Vibrational Spectra ..... 585
5.4. Energy Components ..... 586
6. Summary ..... 589
Chapter 23. Molecular Similarity and Host-guest Interactions ..... 593
Paul G. Mezey
1. Introduction ..... 594
2. From Functional Groups to Extended Molecular Regions ..... 596
3. Elements of Electron Density Shape Analysis ..... 600
4. Electron Density Analysis of Isolated and Interacting Reactive Regions of Molecules ..... 602
5. Shape Similarity Measures in the Study of Host-guest Interactions ..... 607
6. Summary ..... 609
Chapter 24. Chemical Bonding in Proteins and Other Macromolecules ..... 613
Paul G. Mezey
1. Introduction ..... 614
2. Macromolecular Quantum Chemistry Based on Additive Fuzzy Density Fragmentation (AFDF) ..... 616
3. "Low Density Glue" (LDG) Bonding in Proteins ..... 624
Chapter 25. Models for Understanding and Predicting Protein Structure ..... 637
Dale F. Mierke
1. Introduction ..... 637
2. Methods ..... 640
2.1. Homology Modelling ..... 641
2.2. Secondary Structure Prediction ..... 643
2.3. Primary to Tertiary Prediction ..... 644
2.4. Energetic Force Fields ..... 645
2.5. Reduced Atom Representation ..... 646
2.6. Reduced Conformational Space/Lattice Models ..... 649
3. Conclusions ..... 650
Chapter 26. Possible Sources of Error in the Computer Simulation of Protein Structures and Interactions ..... 655
J.M. Garcia de la Vega, J.M.R. Parker and Serafin Fraga
1. Introduction ..... 655
2. Deficiencies of Potential Energy Functions ..... 656
3. Conformational Characterization ..... 658
4. Conclusions ..... 661
Chapter 27. The Nature of Van der Waals Bond ..... 665
Grzegorz Chalasinski, Malgorzata M. Szczesniak and Slawomir M. Cybulski
1. Introduction ..... 665
2. Fundamental Interaction Energy Components ..... 666
3. Ab Initio Approach to Intermolecular Forces ..... 667
4.1. Exchange Repulsion versus Molecular Shape ..... 670
4.2. Dispersion as the Intermonomer Correlation Effect ..... 673
4.3. Induction, Charge-transfer and SCF Deformation ..... 675
4.4. Example 1. Ar-CO2: Dispersion Bound Complex ..... 676
4.5. Example 2. Water Dimer: Introducing Electrostatics ..... 679
4.6. General Considerations ..... 682
5. Modelling of PES and its Components ..... 682
5.1. Ar-CO2 ..... 683
5.2. Water Dimer ..... 684
6. Trimers and Nonadditive Effects ..... 687
6.1. Ar2-Chromophore Clusters: Exchange and Dispersion Nonadditivity ..... 688
6.2. Water Trimer: Induction Nonadditivity ..... 695
7. Summary ..... 696
Chapter 28. The Nature of the Chemical Bond in Metals, Alloys, and Intermetallic Compounds According to Linus Pauling ..... 701
Zelek S. Herman
1. Introduction ..... 701
2. Quantum Mechanics and the Nature of Metals ..... 703
3. The Metallic Orbital ..... 705
4. The Detailed Analysis of the Statistical Theory of Unsynchronized Resonance of Covalent Bonds ..... 710
5. Calculation of the Number of Metallic Orbitals per Atom from the Statistical Theory of the Unsynchronized Resonance of Covalent Bonds ..... 715
6. The Crystal Structures of the Metals and the Maximum Values of the Metallic Valence ..... 718
7. The Compilation of Metallic Single-bond Radii and Radii for Ligancy 12 ..... 722
8. The Structure and Properties of Elemental Boron. Is it a Metal? ..... 724
9. The Nature of the Metal-Metal Bond in Alloys, Intermetallic Compounds, and on the Surfaces of Alloys ..... 726
10. Superconductivity Interpreted in Terms of the Unsynchronized-resonating-covalent-bond Theory of Metals ..... 732
11. Conclusions ..... 738
Epilogue: Linus Pauling, Quintessential Chemist ..... 749
Dudley Herschbach
Index ..... 755
PROLOGUE
The Chemical Bond on the Eve of the 21st Century
Zvonimir B. Maksić and W.J. Orville-Thomas
Linus Pauling is rated as the most prominent American scientist and the greatest chemist of this century. To some people he was Moses who led chemists to the promised land, whereas others imagined him as the mythical Prometheus who brought quantum mechanical fire to classical chemistry. One thing is beyond doubt - nobody made so many important discoveries in so many different branches of chemistry and related disciplines as Linus Pauling. As Dudley Herschbach put it at the end of this book, he was the quintessential chemist. The younger generation, like his grandson Alexander Kamb, considered him a force of nature. The latter is also the title of an excellent biography skilfully written by Thomas Hager [1]. This book is dedicated to Pauling and his work focusing on the chemical bond. It is, therefore, appropriate to begin with Pauling's own words: "The concept of the chemical bond is the most valuable concept in chemistry. Its development over the past 150 years has been one of the greatest triumphs of the human intellect. I doubt that there is a chemist in the world who does not use it in his or her thinking. Much of modern science and technology has developed because of the existence of this concept" [2]. This is perfectly true: the chemical bond is one of the three most important cornerstones of classical chemistry, together with the notion of atoms in chemical environments and the idea of molecular structure and geometry. The latter reflects a multitude of properties of molecules stored in their structural parameters, size, shape and symmetry. These classical pillars received a proper interpretation and physical meaning from quantum mechanics with one notable exception - molecular structure - which still poses a problem not rigorously solved as yet from first principles.
Many researchers have contributed to this remarkable progress over many decades, and one could characterize the development of quantum chemistry as a permanent crawling revolution in molecular sciences, particularly taking into account recent advances in computational chemistry. Linus Pauling was, however, the pioneer and champion of quantum chemistry in the pre-computer era. It took a genius and a vivid imagination to tackle intricate and perplexing chemical problems by using a slide rule and to make tremendous leaps in understanding chemical bonding, which has substantially contributed to the dramatic growth in the life sciences that we have witnessed in recent years. By using his astonishing ability to reduce the complex to the simple, Pauling shed light on the architecture of molecules and crystals. He explained the directional properties of covalent bonds in an elegant way by introducing polarized local hybrid (chemical) orbitals and inaugurated the concept of resonance within the classical valence bond (VB) theory, which in turn is undergoing a remarkable renaissance. Pauling was the first to establish a quantitative electronegativity scale, thus enabling a simple description of charge distributions in molecules and providing a rationalization of the ionic component of chemical bonding. Combining resonance with the electrostatic interactions, Pauling discovered the important role of hydrogen bonding in determining weak intra- and intermolecular interactions. These interactions proved crucial in understanding essential features of molecules
of life, to mention only proteins. His work on the nature of the peptide bond and on the structural patterns of proteins in terms of alpha helices and beta-pleated sheets are milestones in the development of biochemistry and molecular biology. Instead of listing all discoveries and work which stimulated others to unravel the secrets of Nature - to single out only the Crick-Watson model of DNA as an enlightening example - we shall succinctly say that he erected a more lasting scientific monument than those made of brass or stone. Pauling was a grand master of modelling in science. His models were very simple, reflecting the quintessence of a phenomenon or property under scrutiny and satisfying the Occam's razor principle at the same time. They provide a qualitative understanding of the fundamental principles of chemical structure, bonding and reactivity, thus serving as a guide in the research process. These models are close to chemical intuition by building bridges between a rich chemical experience on one side and rigorous quantum mechanics on the other [3]. It should be stressed that Pauling's models did not only have a heuristic value, but also provided important semiquantitative information on a variety of molecular properties in the pre-computer age. They involve elementary, back of the envelope, calculations, illustrating in the highest sense of the word van't Hoff's statement that imagination and shrewd guesswork are powerful instruments in acquiring scientific knowledge. The success of Pauling's approach is best described by the Figure below, where the accuracy of theoretical models in reproducing a particular property of a very large compound or system of chemical interest is schematically plotted against the rigour of the applied theoretical procedures:
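[Figure: schematic plot of the accuracy of theoretical models against the rigour of the applied theory. Labels: ca. true value; Pauling point; Ph.D. point; postdoctoral result; rigour approaching full theory.]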
This curve possesses several characteristic maxima and minima asymptotically approaching the exact value for the full theory. The first maximum corresponds to the Pauling point, where a simple and transparent physical model gives insight and reasonable agreement with experiment by focusing on the dominant effect(s) only. The quantitative description requires a much more sophisticated theoretical approach and meticulous calculations. It should be emphasized, however, that simple conceptual models have led to great discoveries in the molecular sciences, which cracked some very important codes of nature, more frequently than the exact theories and detailed calculations. The latter usually came in the a posteriori stage to confirm that a bold hypothesis was correct. Not all of Pauling's models and concepts were new and original. For instance, the electronegativity idea dates back to Avogadro and Berzelius in the beginning of the 19th century [4]. However, he gave to many of them a deeper meaning and showed their chemical relevance by utilizing his encyclopedic knowledge. His views on chemical bonding were summarized in a superb landmark book "The Nature of the Chemical Bond" [5], which inspired generations of chemists. It is frequently cited as one of the most influential scientific books of our century. This is not surprising because well established models provide in general a scientific vocabulary and lend themselves to classification purposes. They give a pervasive physical insight and extract the key features of very complex phenomena, thus revealing their essence and simplicity. It should be stressed that reliable models possess a grain of truth, which is not always realised. They are true within the limits of the approximations involved and within a carefully determined range of applicability - no more but, at the same time, no less. Metaphorically speaking, models extend the range of our senses and make it possible to "see" mentally what cannot be seen [6].
In the meantime, breakthroughs in tackling molecular many-body problems by computational quantum chemistry based on new theoretical schemes, novel numerical methods and the dramatically fast development of computer technology made possible a quantitative description of versatile chemical bonding phenomena comparable to that offered by experiments. It is timely to give a modern, present-day, theoretical description to many of the apparently successful Pauling models and seminal ideas and to present a refined interpretation of many subtle effects, which were not amenable to theoretical analysis earlier. Coverage of the recent advances in modelling of chemical bonding is therefore the main task of this book written by some of the most prominent experts in the field. Chapters on the molecular structure, geometries of fused aromatics and their electrophilic reactivity, bond energy, electronegativity, hybridization, aromaticity, contemporary VB methods, hydrogen bonding and the structure of the proteins and other large biological compounds reflect much of the leading current thinking. They are prepared by carefully avoiding dangers of the Scylla of intricacy and the Charybdis of oversimplification and by putting a considerable emphasis on the interpretation of theoretical results. It can be stated safely that most of Pauling's models have stood the test of time and found rigorous justification. However, it should be pointed out strongly that many authors, by building on Pauling's ideas and by standing on his shoulders, overcame the limitations of the old, crude and sometimes fully empirical models by making bold steps forward, thus expanding the frontiers of molecular sciences. Although such a book is never complete, it is our belief and hope that it will contribute to a better understanding of the ubiquitous chemical bond and become an indispensable textbook for postgraduate/doctoral students in physical and advanced physical organic chemistry. It is important to point out in this connection that quantum chemistry - the Holy Grail of molecular sciences - will have an ever-increasing role in the 21st century, particularly in establishing strong links between chemistry and molecular biology and thus featuring as a unifying methodology. Finally, we would like to use this opportunity to thank all authors for their scholarly written and intellectually stimulating chapters, which made this book possible.
REFERENCES
1. T. Hager, Force of Nature - The Life of Linus Pauling, Simon & Schuster, New York, NY, 1995.
2. L. Pauling, The Nature of the Chemical Bond - 1992, J. Chem. Ed., 69 (1992) 519.
3. Z.B. Maksić, On the Significance of Theoretical Models of Chemical Bonding - Prologue to the Special Subject Issue on Conceptual Quantum Chemistry: Models and Applications, Part 1, Croat. Chem. Acta 57 (1984) No. 5.
4. W.B. Jensen, Electronegativity from Avogadro to Pauling, J. Chem. Ed., 73 (1996) 11.
5. L. Pauling, The Nature of the Chemical Bond and the Structure of Molecules and Crystals, Third Ed., Cornell University Press, 1960.
6. Z.B. Maksić, Modelling - A Search for Simplicity, in Theoretical Models of Chemical Bonding, Vol. 1, Z.B. Maksić, Ed., Springer Verlag, Berlin-Heidelberg, 1990, p. 13.
Z.B. Maksić and W.J. Orville-Thomas (Editors)
Pauling's Legacy: Modern Modelling of the Chemical Bond
Theoretical and Computational Chemistry, Vol. 6
© 1999 Elsevier Science B.V. All rights reserved.
THEORETICAL TREATISE ON MOLECULAR STRUCTURE AND GEOMETRY
Jerzy Cioslowski
Department of Chemistry and Supercomputer Computations Research Institute, Florida State University, Tallahassee, Florida 32306-3006, USA
1. INTRODUCTION: THE HIERARCHY OF MODELS IN CHEMISTRY
The primary objective of modern science is to construct theories that faithfully describe the reality. Theories are nothing but models that are subject to perpetual experimental scrutiny and checks of internal consistency. When a particular theory is abandoned, it is for one of several reasons. Some theories, such as those of phlogiston and caloric, are simply proven wrong. Others, such as the geocentric theory of the universe, are superseded by simpler descriptions of reality. Finally, some formalisms (such as the Newtonian mechanics) are found to possess only a limited validity or to constitute special cases of more general (unified) theories.
In this respect, chemistry does not differ from other sciences. Contemporary chemical research is organized around a hierarchy of models that aid its practitioners in their everyday quest for the understanding of natural phenomena. The building blocks of the language of chemistry, including the representations of molecules in terms of structural formulae [1], occupy the very bottom of this hierarchy. Various phenomenological models, such as reaction types and mechanisms, thermodynamics and chemical kinetics, etc. [2], come next. Quantum chemistry, which at present is the supreme theory of electronic structures of atoms and molecules, and thus of the entire realm of chemical phenomena, resides at the very top.
This research was supported by the National Science Foundation under the grant CHE-9632706. E-mail address: [email protected], web page: http://www.scri.fsu.edu/~jerzy.
First formulated in the second half of the nineteenth century, the concept of molecular structure had evolved from a working hypothesis to the major tenet of chemistry by the time of the advent of modern quantum mechanics. At last, the new theory provided the means for predicting and explaining properties of atoms and molecules. However, with its description of matter that was (and still is) alien to many chemists conditioned by the experiences in the macroscopic world, the new theory stood little chance of displacing the existing models of chemical species and their transformations. Consequently, peculiar hybrid formalisms that invoke conventional chemical notions dressed up in the language of quantum mechanics soon emerged. These formalisms, which are collectively known as the electronic structure theory, are in use to this day. Modern electronic structure theory employs two levels of simplification. The use of various mathematical approximations is dictated by the limitations of computer hardware and the need for keeping the cost of quantum-chemical calculations within reasonable limits. In contrast, the avoidance of quantum-mechanical treatment of nuclei is deeply rooted in the aforementioned conceptual prejudices. While the severity of mathematical approximations is on a constant decrease thanks to the ever-increasing speed and availability of computers (note the gradual disappearance of semiempirical calculations from the chemical literature!), the validity of views that regard molecules as quasi-rigid assemblies of nuclei held together by electron clouds is rarely questioned by the majority of researchers. Twenty years have passed since the publication of the original paper by Woolley [3] in which the incompatibility of the molecular structure concept with the rigorous quantum-mechanical description of isolated molecules has been eloquently brought to the attention of chemists.
The ensuing flurry of research publications clarified several misconceptions but did little to familiarize the broader scientific audience with this important issue. Regretfully, few quantum or computational chemists are aware of these papers, which are nowadays seldom discussed or quoted. This short treatise is intended to provide the reader with a concise summary of the current theoretical status of the molecular structure and geometry concepts. A fully quantum-mechanical treatment of molecules is employed where necessary. The relevance of stationary states of isolated molecules is discussed and the notion of molecular geometry is contrasted with that of molecular structure.
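2. MOLECULAR WAVEFUNCTIONS
A nonrelativistic description of molecules is provided by the well-known Hamiltonian,

$$\hat{H} = \hat{T}_n + \hat{T}_e + \hat{V} , \tag{1}$$

where

$$\hat{T}_n = (-1/2) \sum_I m_I^{-1} \nabla_I^2 , \tag{2}$$

$$\hat{T}_e = (-1/2) \sum_i \nabla_i^2 , \tag{3}$$

and

$$\hat{V} = \sum_{I<J} \frac{Z_I Z_J}{|R_I - R_J|} - \sum_{I,i} \frac{Z_I}{|R_I - r_i|} + \sum_{i<j} \frac{1}{|r_i - r_j|} . \tag{4}$$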
In Eqs. (2)-(4), $\{m_I\}$ and $\{Z_I\}$ are the vectors of nuclear masses and charges, respectively, whereas $R \equiv \{R_I\}$ and $r \equiv \{r_i\}$ stand for the positions of nuclei and electrons. Atomic units are used throughout the text. As a consequence of its translational invariance, $\hat{H}$ possesses infinitely many eigenstates $\{\Psi_{pN}(r,R)\}$ and its spectrum is continuous. The entire set $\{\Psi_{pN}(r,R)\}$ can be readily reconstructed from a finite manifold of eigenstates that correspond to zero linear momentum $p$,

$$\Psi_{pN}(r,R) = \exp(i\, p \cdot R_{CM})\, \Psi_{0N}(r - R_{CM}, R - R_{CM}) , \tag{5}$$

where $R_{CM}$ is the position vector of the center of mass. In the following, both $r$ and $R$ always refer to vectors relative to $R_{CM}$, and $\Psi_{0N}(r - R_{CM}, R - R_{CM})$ (the spectroscopic states [4]) are denoted simply by $\Psi_N(r,R)$. One should note that the removal of three degrees of freedom from $(r,R)$ makes the spectroscopic states normalizable.
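The passage to center-of-mass coordinates described above can be illustrated numerically. The following is only a minimal sketch: the particle masses and positions are hypothetical values chosen for illustration (two proton-like nuclei with an approximate mass of 1836.15 atomic units and two electrons of unit mass), and the helper names `center_of_mass` and `to_cm_frame` are not taken from the text.

```python
import numpy as np

def center_of_mass(masses, positions):
    """Mass-weighted average position of all particles (nuclei and electrons)."""
    masses = np.asarray(masses, dtype=float)
    positions = np.asarray(positions, dtype=float)
    return masses @ positions / masses.sum()

def to_cm_frame(masses, positions):
    """Positions expressed relative to the center of mass."""
    return np.asarray(positions, dtype=float) - center_of_mass(masses, positions)

# Hypothetical four-particle system in atomic units (illustration only):
# two nuclei with an approximate proton mass and two electrons.
masses = np.array([1836.15, 1836.15, 1.0, 1.0])
positions = np.array([
    [0.0, 0.0, -0.7],
    [0.0, 0.0, 0.7],
    [0.3, 0.1, -0.5],
    [-0.2, 0.4, 0.6],
])

# Translating the whole system leaves the relative coordinates unchanged.
shift = np.array([5.0, -3.0, 2.0])
rel = to_cm_frame(masses, positions)
rel_shifted = to_cm_frame(masses, positions + shift)
print(np.allclose(rel, rel_shifted))

# The relative coordinates obey one vector constraint (sum of m * x = 0),
# which is the removal of three degrees of freedom mentioned in the text.
print(np.allclose(masses @ rel, 0.0))
```

The second check makes explicit why three degrees of freedom disappear: the mass-weighted sum of the relative position vectors vanishes identically.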
The invariance of $\hat{H}$ under other transformations of the coordinate system [5] imposes certain symmetries upon the elements of the set $\{\Psi_N(r,R)\}$. In particular, rotational invariance implies that these states must be eigenfunctions of the total angular momentum operators,

$$\hat{J}^2\, \Psi_N(r,R) = J(J+1)\, \Psi_N(r,R) , \qquad \hat{J}_z\, \Psi_N(r,R) = M\, \Psi_N(r,R) . \tag{6}$$

In addition, thanks to the space-inversion symmetry of $\hat{H}$, spectroscopic states must possess definite parities,

$$\Psi_N(-r,-R) = \Pi\, \Psi_N(r,R) , \qquad \Pi \in \{-1, 1\} . \tag{7}$$

Consequently, spectroscopic states can be labeled $\{\Psi_{NJM\Pi}(r,R)\}$.
Although $\hat H$ is nonrelativistic, relativistic effects manifest themselves in the realizable eigenstates $\{\Psi_{NJM\Pi}(r,R)\}$ through permutational symmetry. Spectroscopic states must be totally antisymmetric under the permutation of the labels of any two electrons. In addition, total antisymmetry/symmetry with respect to the permutations of the labels of fermionic/bosonic nuclei must be exhibited. For this reason, the set $\{\Psi_{NJM\Pi}(r,R)\}$ is not determined by $\{m_I,Z_I\}$ alone but by $\{m_I,Z_I,S_I\}$, where $\{S_I\}$ is the set of nuclear spins.
Let $\Phi(r,R)$ be an arbitrary trial wavefunction expressed in terms of r and R relative to $R_{CM}$. The preceding considerations lead to the conclusion that the symmetry-adapted $\Phi(r,R)$ must be of the form
$$\Phi_{JM\Pi}(r,R) = \hat A_n\,\hat A_e \int W_{JM}(\Omega)\,\big[\Phi(Ur,UR) + \Pi\,\Phi(-Ur,-UR)\big]\,d\Omega, \tag{8}$$
where $\hat A_n$ is the nuclear symmetrizer/antisymmetrizer, $\hat A_e$ is the electronic antisymmetrizer, $W_{JM}(\Omega)$ is the appropriate weight function of the Euler angles $\Omega \equiv (\varphi,\theta,\gamma)$ [6], and $U \equiv U(\Omega)$ is given by
$$U = \begin{pmatrix}
\cos\gamma\cos\theta\cos\varphi-\sin\gamma\sin\varphi & \cos\gamma\cos\theta\sin\varphi+\sin\gamma\cos\varphi & -\cos\gamma\sin\theta\\
-\sin\gamma\cos\theta\cos\varphi-\cos\gamma\sin\varphi & -\sin\gamma\cos\theta\sin\varphi+\cos\gamma\cos\varphi & \sin\gamma\sin\theta\\
\sin\theta\cos\varphi & \sin\theta\sin\varphi & \cos\theta
\end{pmatrix}. \tag{9}$$
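As a quick consistency check on the rotation matrix of Eq.(9), the following minimal sketch builds U for arbitrary Euler angles and verifies numerically that it is a proper rotation (orthogonal, with unit determinant). The function names and the angle ordering $(\varphi,\theta,\gamma)$ are assumptions made for this illustration and are not taken from the original text.

```python
import math

def euler_rotation(phi, theta, gamma):
    """Return the 3x3 matrix U(Omega) of Eq. (9) as nested lists.

    The angle ordering (phi, theta, gamma) is an assumption of this sketch.
    """
    cp, sp = math.cos(phi), math.sin(phi)
    ct, st = math.cos(theta), math.sin(theta)
    cg, sg = math.cos(gamma), math.sin(gamma)
    return [
        [cg * ct * cp - sg * sp, cg * ct * sp + sg * cp, -cg * st],
        [-sg * ct * cp - cg * sp, -sg * ct * sp + cg * cp, sg * st],
        [st * cp, st * sp, ct],
    ]

def gram_matrix(u):
    """Return U U^T, which equals the identity for an orthogonal U."""
    return [[sum(u[i][k] * u[j][k] for k in range(3)) for j in range(3)]
            for i in range(3)]

def determinant(u):
    """Determinant of a 3x3 matrix by cofactor expansion along the first row."""
    return (u[0][0] * (u[1][1] * u[2][2] - u[1][2] * u[2][1])
            - u[0][1] * (u[1][0] * u[2][2] - u[1][2] * u[2][0])
            + u[0][2] * (u[1][0] * u[2][1] - u[1][1] * u[2][0]))

def max_orthogonality_error(u):
    """Largest deviation of U U^T from the 3x3 identity matrix."""
    g = gram_matrix(u)
    return max(abs(g[i][j] - (1.0 if i == j else 0.0))
               for i in range(3) for j in range(3))
```

Calling `max_orthogonality_error(euler_rotation(0.3, 1.1, -2.0))` should return a value at the level of floating-point round-off, which confirms that Ur and UR in Eq.(8) are rigidly rotated copies of r and R.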
3. DECOUPLING OF NUCLEAR AND ELECTRONIC DEGREES OF FREEDOM
The electronic Hamiltonian,
$$\hat H_e = \hat T_e + \hat V, \tag{10}$$
is parameterized by the nuclear coordinates R. Its eigenstates $\{\psi_N(r|R)\}$ and the corresponding eigenenergies $\{\varepsilon_N(R)\}$ transform as follows upon rotations and space inversion of R:
$$\psi_N(Ur|UR) = \psi_N(r|R), \qquad \varepsilon_N(UR) = \varepsilon_N(R), \tag{11}$$
$$\psi_N(-r|-R) = \psi_N(r|R), \qquad \varepsilon_N(-R) = \varepsilon_N(R). \tag{12}$$
It ensues from the property (11) that it is sufficient to define $\{\psi_N(r|\mathcal{R})\}$ and $\{\varepsilon_N(\mathcal{R})\}$ only within the domain of internal nuclear coordinates $\mathcal{R}$. The replacement of R by $\mathcal{R} = \{\mathcal{R}_I\}$, where $\mathcal{R}_I = \{X_I,Y_I,Z_I\}$, which results in the removal of three degrees of freedom (two for linear molecules), corresponds to adopting a rotating ("body-fixed") coordinate system in place of the fixed ("space-fixed") one. Various definitions of the former coordinate system are possible, the most natural involving the requirement that the tensor of inertia,
$$I_{\alpha\beta} = \sum_I m_I\,\alpha_I\,\beta_I, \qquad \alpha,\beta \in \{X,Y,Z\}, \tag{13}$$
is diagonal. The ground eigenstate $\psi_0(r|\mathcal{R})$ can be employed as a continuous basis set for the trial wavefunction $\Phi(r,R)$,
$$\Phi(r,R) = \int \psi_0(r|\mathcal{R})\,\chi(R|\mathcal{R})\,d\mathcal{R}, \tag{14}$$
where the integration runs over the internal nuclear coordinates $\mathcal{R}$. Symmetry adaptation [Eq.(8)] of such $\Phi(r,R)$ produces
$$\Phi_{JM\Pi}(r,R) = \hat A_n \iint W_{JM}(\Omega)\,\big[\psi_0(Ur|\mathcal{R})\,\chi(UR|\mathcal{R}) + \Pi\,\psi_0(Ur|-\mathcal{R})\,\chi(-UR|\mathcal{R})\big]\,d\mathcal{R}\,d\Omega, \tag{15}$$
where Eq.(12) has been used and proper electronic antisymmetry of $\psi_0(r|\mathcal{R})$ has been assumed. The simplest choice for the function $\chi(R|\mathcal{R})$ is provided by the Born-Oppenheimer ("clamped nuclei") approximation [7],
$$\chi(R|\mathcal{R}) = \delta(R-\mathcal{R})\,\delta(\mathcal{R}-\mathcal{R}_N), \tag{16}$$
[Eq. (17) is illegible in the source.]
where $\mathcal{R}_N$ is the position of one of the minima of $\varepsilon_N(\mathcal{R})$. It is important to note that such a choice of $\chi(R|\mathcal{R})$ leads to $\Phi_{JM\Pi}(r,R)$ that cannot be an eigenstate of the molecular Hamiltonian with finite nuclear masses. This problem is rectified in the adiabatic approximation [8], in which $\chi(R|\mathcal{R})$ takes a form that is illegible in the source and leads to
[Eq. (18) is illegible in the source.]
where $\mathcal{R}$ is such that $U(\Omega)\mathcal{R}$ coincides with R.
Both the Born-Oppenheimer and adiabatic approximations decouple the nuclear degrees of freedom from the electronic ones. This coupling is restored in more sophisticated representations of $\chi(R|\mathcal{R})$ that employ the generator coordinate method (GCM) [6,9]. Such representations, which are capable of correctly describing the instantaneous following of electronic motions by nuclei, provide partial justification for the approximate separation of electronic, vibrational, and rotational degrees of freedom that is commonly invoked in theories of molecular spectra [6]. However, the simple representation (18) suffices for the purpose of discussing the concepts of molecular geometry and structure.
4. THE RELEVANCE OF SPECTROSCOPIC STATES
Since, as mentioned in Section 2, the spectroscopic states $\{\Psi_{NJM\Pi}(r,R)\}$ are fully determined by the set $\{m_I,Z_I,S_I\}$, isomeric systems with identical isotopic compositions share the same molecular Hamiltonian [10], implying that $\{\Psi_{NJM\Pi}(r,R)\}$ represent stationary states of isolated systems of nuclei and electrons rather than identifiable molecules. Hence, it is legitimate to question the relevance of such states to the description of chemical species.
Although $\hat H$ has a continuous spectrum, its subspectrum $\{E_{NJM\Pi}\}$ that is associated with $\{\Psi_{NJM\Pi}(r,R)\}$ is mostly discrete. The discrete portion of $\{E_{NJM\Pi}\}$ is characterized by several energy scales. Energy levels with identical N, J, and M but different parities $\Pi$ are closely spaced. For example, the splitting $\Delta E = |E_{000,+1}-E_{000,-1}|$ of ca. 4 [μHartree] (which is equivalent to $h/\Delta E$ = 40 [ps] and $\Delta E/k$ = 1 [K]) in NH3 is considered quite large [11], the value of $h/\Delta E$ for AsH3 that is estimated at ca. 3000 [h] being probably more typical [12]. The spacing between energy levels with identical N but different angular momenta J is of the order of 10-100 [μHartree] for small diatomics and decreases very quickly with increasing molecular size. Finally, the spacing between energy levels with different N can be as small as 1 [mHartree] and as large as 200 [mHartree].
Whether or not two eigenstates of a given Hamiltonian can be individually observed depends on their energy difference. In the following, the various factors that determine the observability of spectroscopic states (and thus their relevance) are discussed.
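The correspondence between an energy splitting, its characteristic time $h/\Delta E$, and its equivalent temperature $\Delta E/k$ quoted above can be reproduced with a few lines of code. The sketch below is only an illustration: the function names are hypothetical, and the two conversion constants are the standard values of the atomic unit of time and of the Hartree energy expressed in kelvin.

```python
import math

# Standard conversion factors (atomic units -> SI-derived units).
ATOMIC_TIME_IN_SECONDS = 2.4188843265857e-17   # hbar / E_h
HARTREE_IN_KELVIN = 3.1577502480407e5          # E_h / k_B

def characteristic_time_seconds(delta_e_hartree):
    """Return h / Delta E in seconds; h = 2*pi in atomic units."""
    return 2.0 * math.pi / delta_e_hartree * ATOMIC_TIME_IN_SECONDS

def equivalent_temperature_kelvin(delta_e_hartree):
    """Return Delta E / k in kelvin."""
    return delta_e_hartree * HARTREE_IN_KELVIN

def splitting_from_time_hartree(time_seconds):
    """Invert h / Delta E: return Delta E in Hartree for a given time."""
    return 2.0 * math.pi * ATOMIC_TIME_IN_SECONDS / time_seconds

# NH3-like splitting of ca. 4 microHartree, as quoted in the text.
nh3_time = characteristic_time_seconds(4.0e-6)
nh3_temperature = equivalent_temperature_kelvin(4.0e-6)
```

With the 4 [μHartree] splitting, `nh3_time` comes out at a few tens of picoseconds and `nh3_temperature` at about one kelvin, in line with the rounded figures given for NH3. The inverse function can be applied to the 3000 [h] estimate for AsH3 to see how many orders of magnitude separate the two splittings.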
4.1. Time dependence
Time scales of many experiments are short enough to make the issue of the stationary character of the observed states totally irrelevant. For example, consider two spectroscopic states $\Psi_+$ and $\Psi_-$,
$$\Psi_+ \equiv \Psi_{NJM,+1} \quad\text{and}\quad \Psi_- \equiv \Psi_{NJM,-1}, \tag{19}$$
whose energies differ by $\Delta E$. Being nonstationary, the mixed-parity states $\Psi_R$ and $\Psi_L$,
$$\Psi_R = 2^{-1/2}\,(\Psi_+ + \Psi_-) \quad\text{and}\quad \Psi_L = 2^{-1/2}\,(\Psi_+ - \Psi_-), \tag{20}$$
are subject to quantum beats with the period of $T = h/\Delta E$ (one should note that the radiational decay can be safely neglected here). Consequently,
when a mixed-parity state $\Psi_R$ is prepared, it may evolve very little in the course of the experiment, provided that $\Delta E$ is sufficiently small. Considerations of this nature give rise to the notion of the so-called feasible symmetry operations [5]. From a chemist's point of view, space inversion is a feasible symmetry operation for NH3, as the lifetime of the mixed-parity states of this molecule is only ca. 40 [ps] (see above). On the other hand, AsH3 may stay in one of its mixed-parity states for several months. Thus, the description of AsH3 in terms of the mixed-parity states is as legitimate as that in terms of spectroscopic states with definite parities.
Similar observations pertain to the phenomenon of optical activity [13,14]. The fact that spectroscopic states cannot exhibit such activity is known as the Hund paradox [13]. However, once prepared, the enantiomers that are described by the wavefunctions $\Psi_R$ and $\Psi_L$ can persist for very long times thanks to the extreme smallness of the energy splittings $\Delta E$ in these cases ($\Delta E$ as small as $10^{-50}$ [au], which corresponds to $h/\Delta E$ of the order of [years] [15]).
4.2. Interactions with external fields
The spectroscopic states are eigenfunctions of the unperturbed molecular Hamiltonian. The behavior of these states and their energies upon weak external perturbations is governed by perturbation theory. However, when matrix elements of the perturbation operator between the spectroscopic states are much greater than the corresponding energy splittings, $\{\Psi_{NJM\Pi}(r,R)\}$ have to be replaced by functions that form a basis in which the perturbation operator is diagonal. Such basis functions, which are linear combinations of the spectroscopic states, possess reduced symmetries.
An example involving a molecule interacting with a homogeneous electric field of strength $\mathcal{E}$ helps clarify the above statement [16]. Due to their definite parities, spectroscopic states cannot possess permanent dipole moments. When the field is weak enough, spectroscopic states retain their identities to the first degree of approximation, and the changes in their energies are quadratic in $\mathcal{E}$: the second-order Stark effect is observed (the fact that the perturbation series is an asymptotic expansion is ignored here for the sake of simplicity). However, once the magnitude of interaction with the electric field exceeds the spacing between the adjacent
energy levels of the unperturbed Hamiltonian, the use of broken-symmetry localized states is more appropriate than that of $\{\Psi_{NJM\Pi}(r,R)\}$ [17]. The localized states, which describe the molecule under consideration with its dipole moment oriented along the direction of the electric field, have energies that change linearly with $\mathcal{E}$, a phenomenon known as the first-order Stark effect. In systems such as AsH3 or CH3F, the splitting between energy levels of different parity is so small that the first-order Stark effect is always observed in practice. In the conventional treatment of these "rigid symmetric-top rotors" [11], this phenomenon is attributed to the presence of a permanent dipole moment. However, in light of the present discussion, this dipole moment is not a property of the molecule alone but is acquired through interaction with the external field. In the "nonrigid symmetric-top rotors" (such as NH3), the second-order Stark effect is observed under normal circumstances. Indeed, field strengths of the order of 1 600 000 [V/m] are required to bring the interaction into the first-order regime in this case [18].
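The crossover between the second-order and first-order Stark regimes can be illustrated with a textbook two-level model, which is an assumption of this sketch rather than a formula taken from the chapter. Two states of opposite parity separated by `delta_e` are coupled by an off-diagonal element `coupling` (standing for the product of a transition dipole matrix element and the field strength), so that the perturbed levels are $\pm[(\Delta E/2)^2 + \text{coupling}^2]^{1/2}$ relative to the midpoint.

```python
import math

def stark_levels(delta_e, coupling):
    """Eigenvalues of the 2x2 model [[-dE/2, -c], [-c, +dE/2]].

    delta_e  : zero-field splitting between the two parity states
    coupling : hypothetical off-diagonal element (transition dipole times field)
    """
    half_gap = math.hypot(delta_e / 2.0, coupling)
    return -half_gap, half_gap

def lower_level_shift(delta_e, coupling):
    """Shift of the lower level relative to its zero-field position."""
    return stark_levels(delta_e, coupling)[0] + delta_e / 2.0

# Weak coupling (coupling << delta_e): the shift is quadratic in the field,
# so doubling the coupling multiplies the shift by about four.
weak_ratio = lower_level_shift(1.0, 2.0e-3) / lower_level_shift(1.0, 1.0e-3)

# Strong coupling (coupling >> delta_e): the level itself is linear in the
# field, so doubling the coupling roughly doubles the level energy.
strong_ratio = stark_levels(1.0e-3, 2.0)[0] / stark_levels(1.0e-3, 1.0)[0]
```

In this model the only quantity that decides which regime applies is the ratio of the coupling to the zero-field splitting, which is why nearly degenerate parity doublets reach the linear regime at much weaker fields than well-separated ones.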
In contrast, very weak interactions suffice to make the mixed-parity states $\Psi_R$ and $\Psi_L$ appropriate for the description of optically active systems. Parity-violating neutral currents have been proposed as the interaction missing from the molecular Hamiltonian [Eq.(1)] that is responsible for the existence of enantiomers [14,19]. At present, this hypothesis is still awaiting experimental verification.
External perturbations can mix not only spectroscopic states of different parities but also those of different total angular momenta. If the interactions are sufficiently strong, almost complete localization of nuclei may be achieved, yielding molecular states with well defined geometries. However, it is important to emphasize that no nonrelativistic perturbation is capable of breaking the nuclear antisymmetry/symmetry of the molecular wavefunction.
4.3. Intermolecular interactions
The fact that states with definite parities are almost never observed experimentally cannot be explained alone by the persistence of mixed-parity states. Obviously, symmetry-breaking phenomena are in operation whether or not external perturbations are present. For species in condensed media, symmetry breaking is brought about by intermolecular interactions. In the extreme case of solids, these interactions are so strong that it is proper
to describe molecules with the localized states discussed in Section 4.2 of this treatise. However, one should realize that even very weak intermolecular interactions suffice to effect symmetry breaking in cases where energy levels of the molecular Hamiltonian are closely spaced. These interactions do not have to break symmetry themselves as long as they introduce nonlinearities in the Hamiltonian [12,20].
Consider a system that can exist in two spectroscopic states $\Psi_+$ and $\Psi_-$ [Eq.(19)]. Quantum evolution of this system under the following nonlinear Hamiltonian,
$$\hat H' = \hat H + \lambda\,\langle\Psi|\hat V|\Psi\rangle\,\hat V, \tag{21}$$
where $\lambda$ measures the strength of the intermolecular interaction, affords a useful model of symmetry-breaking phenomena [21]. Hamiltonians of this type describe interactions of molecules with solvents within the dielectric continuum models [22]. Since the states $\Psi_+$ and $\Psi_-$ are eigenfunctions of the unperturbed molecular Hamiltonian,
$$\hat H\,\Psi_+ = E_+\,\Psi_+, \qquad \hat H\,\Psi_- = E_-\,\Psi_-, \tag{22}$$
the matrix elements of $\hat H$ for the mixed-parity states $\Psi_R$ and $\Psi_L$ [Eq.(20)] are given by
$$\langle\Psi_R|\hat H|\Psi_R\rangle = \langle\Psi_L|\hat H|\Psi_L\rangle = \tfrac{1}{2}(E_+ + E_-) = E_0, \tag{23}$$
$$\langle\Psi_R|\hat H|\Psi_L\rangle = \langle\Psi_L|\hat H|\Psi_R\rangle = \tfrac{1}{2}(E_+ - E_-) = \tfrac{1}{2}\,\Delta E. \tag{24}$$
If $\hat V$ is an odd operator such as $\hat\mu$,
$$\langle\Psi_+|\hat V|\Psi_+\rangle = \langle\Psi_-|\hat V|\Psi_-\rangle = 0, \tag{25}$$
then for real-valued $\Psi_R$ and $\Psi_L$,
$$\langle\Psi_R|\hat V|\Psi_R\rangle = -\langle\Psi_L|\hat V|\Psi_L\rangle = V, \tag{26}$$
where $V = \langle\Psi_+|\hat V|\Psi_-\rangle = \langle\Psi_-|\hat V|\Psi_+\rangle$, and
$$\langle\Psi_L|\hat V|\Psi_R\rangle = \langle\Psi_R|\hat V|\Psi_L\rangle = 0. \tag{27}$$
The evolution of the system in question is described by the equation
$$\partial\Psi(t)/\partial t = -\,i\,\hat H'\,\Psi(t). \tag{28}$$
An exact time-dependent representation of $\Psi(t)$ is given by the expression
$$\Psi(t) = c_R(t)\,\exp(-iE_0t)\,\Psi_R + c_L(t)\,\exp(-iE_0t)\,\Psi_L, \tag{29}$$
which leads to the following system of coupled differential equations
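$$i\,\partial c_R(t)/\partial t = \tfrac{1}{2}\Delta E\,\big[c_L(t) + \beta\,(|c_R(t)|^2-|c_L(t)|^2)\,c_R(t)\big], \tag{30}$$
$$i\,\partial c_L(t)/\partial t = \tfrac{1}{2}\Delta E\,\big[c_R(t) - \beta\,(|c_R(t)|^2-|c_L(t)|^2)\,c_L(t)\big], \tag{31}$$
where
$$\beta = 2\lambda V^2/\Delta E \tag{32}$$
is a scaled coupling constant. With the introduction of the real-valued quantities P(t), D(t), and S(t),
$$P(t) = |c_R(t)|^2-|c_L(t)|^2, \tag{33}$$
$$D(t) = i\,[c_R(t)\,c_L^*(t) - c_R^*(t)\,c_L(t)], \tag{34}$$
$$S(t) = c_R^*(t)\,c_L(t) + c_R(t)\,c_L^*(t), \tag{35}$$
Eqs.(30) and (31) can be turned into
$$\partial P(t)/\partial t = \Delta E\,D(t), \qquad \partial D(t)/\partial t = \Delta E\,P(t)\,[\beta\,S(t) - 1], \qquad \partial S(t)/\partial t = -\beta\,\Delta E\,D(t)\,P(t). \tag{36}$$
Eqs.(36) are readily integrable. If at t = 0 the system is in the state described by $\Psi_R$, then P(0) = 1, D(0) = S(0) = 0, and P(t) = cos q, where q is the solution of the following equation
$$t = (\Delta E)^{-1}\int_0^q \big[1-(\beta/2)^2\sin^2 x\big]^{-1/2}\,dx. \tag{37}$$
If $\beta < 2$, the system undergoes oscillations between the states $\Psi_R$ and $\Psi_L$. The period of these oscillations is given by
$$T = 4\,(\Delta E)^{-1}\int_0^{\pi/2} \big[1-(\beta/2)^2\sin^2 x\big]^{-1/2}\,dx, \tag{38}$$
i.e. is always longer than in the isolated-molecule case and becomes infinite as $\beta$ approaches the critical value of 2. If $\beta \geq 2$, the system evolves aperiodically from $\Psi_R$ to a mixed-parity state $\Psi_f$,
$$\Psi_f = \Psi_R\cos\alpha + \Psi_L\sin\alpha, \qquad \sin\alpha = \big\{[\beta-(\beta^2-4)^{1/2}]/(2\beta)\big\}^{1/2}. \tag{39}$$
For large values of the coupling constant $\beta$, $\Psi_f$ is practically the same as $\Psi_R$. Similarly, provided $\beta$ is very large, a system prepared in the state $\Psi_L$ remains in it forever.
Intermolecular interactions are capable of stabilizing broken-symmetry states in other ways. For example, contrary to intuitive expectations, random molecular collisions tend to preserve mixed-parity states [23].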
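The qualitative change of behavior at the critical coupling can be explored by integrating Eqs.(36) numerically. The following is a minimal sketch, assuming atomic units, the initial condition P(0) = 1, D(0) = S(0) = 0, and a hand-written fourth-order Runge-Kutta integrator; the function names, step size, and integration time are choices made for this illustration only.

```python
def derivatives(state, beta, delta_e):
    """Right-hand sides of Eqs. (36) for state = (P, D, S)."""
    p, d, s = state
    return (delta_e * d,
            delta_e * p * (beta * s - 1.0),
            -beta * delta_e * d * p)

def rk4_step(state, dt, beta, delta_e):
    """Advance (P, D, S) by one classical Runge-Kutta step of length dt."""
    def shifted(base, slope, h):
        return tuple(b + h * k for b, k in zip(base, slope))
    k1 = derivatives(state, beta, delta_e)
    k2 = derivatives(shifted(state, k1, dt / 2.0), beta, delta_e)
    k3 = derivatives(shifted(state, k2, dt / 2.0), beta, delta_e)
    k4 = derivatives(shifted(state, k3, dt), beta, delta_e)
    return tuple(y + dt / 6.0 * (a + 2.0 * b + 2.0 * c + e)
                 for y, a, b, c, e in zip(state, k1, k2, k3, k4))

def evolve(beta, delta_e=1.0, t_max=20.0, dt=0.002):
    """Return the trajectory of (P, D, S) starting from the state Psi_R."""
    state = (1.0, 0.0, 0.0)
    trajectory = [state]
    for _ in range(int(round(t_max / dt))):
        state = rk4_step(state, dt, beta, delta_e)
        trajectory.append(state)
    return trajectory

def minimum_population_difference(beta):
    """Smallest value of P(t) = |c_R|^2 - |c_L|^2 reached along the trajectory."""
    return min(p for p, _, _ in evolve(beta))
```

For a subcritical coupling such as `beta = 1.0`, `minimum_population_difference` approaches -1, meaning that the population is transferred completely to $\Psi_L$ at some instant. For a supercritical coupling such as `beta = 4.0`, P(t) stays positive throughout, so the system remains predominantly in $\Psi_R$. The quantity $P^2 + D^2 + S^2$ is conserved by Eqs.(36) and offers a convenient check on the integration.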
5. DESCRIPTION OF MOLECULAR PHENOMENA WITH SPECTROSCOPIC AND LOCALIZED STATES
The conventional view of molecules as quasi-rigid assemblies of atoms represented by localized states is commonly used in descriptions of diverse physical phenomena. In many cases, such descriptions produce viable theories that are in consonance with experimental observations. However, careful analysis of these theories often reveals deficiencies that can be remedied only with a fully quantum-mechanical treatment employing spectroscopic states [16,24]. Two instructive examples that juxtapose such a rigorous treatment against the corresponding conventional approach are cited here in order to illustrate this important point.
First, consider a very dilute gas of molecules [16]. The conventional theory of the static dielectric susceptibility $\chi$ of such a gas invokes the notion of polarizable molecules with permanent dipole moments that are partially aligned by the external electric field $\mathcal{E}$. Standard techniques of statistical thermodynamics produce the Langevin-Debye formula for $\chi$ per molecule that reads
$$\chi = (\mu^2/3kT) + \alpha, \tag{40}$$
where $\mu$ and $\alpha$ are the dipole moment and the average static polarizability of the molecule in question, and T is the gas temperature. This simple picture has to be abandoned once a rigorous quantum-mechanical formulation is sought. Within such a formulation, $\chi$ is given by the statistical average [25],
$$\chi = 2\,\Big[\sum_i \exp(-E_i/kT)\Big]^{-1}\sum_i\sum_{f>i}\langle\Psi_i|\hat\mu|\Psi_f\rangle\cdot\langle\Psi_f|\hat\mu|\Psi_i\rangle\,(E_f-E_i)^{-1}\exp(-E_i/kT)\,\big\{1-\exp[(E_i-E_f)/kT]\big\}, \tag{41}$$
that involves the second-order perturbation theory for the spectroscopic state $\Psi_i$. In Eq.(41), the subscripts i and f are abbreviations for the respective sets of quantum numbers, $i \equiv NJM\Pi$ and $f \equiv N'J'M'\Pi'$. For a pair of states that share the same N, the energy difference $E_f - E_i$ is much smaller than kT at room temperature (see Section 4). Consequently, the contribution from these pairs of states to $\chi$ can be accurately approximated by
$$\chi_1 = (2/kT)\,\Big[\sum_i \exp(-E_i/kT)\Big]^{-1}\sum_i\sum_{f>i}\langle\Psi_i|\hat\mu|\Psi_f\rangle\cdot\langle\Psi_f|\hat\mu|\Psi_i\rangle\,\exp(-E_i/kT)$$
$$\phantom{\chi_1} = (1/kT)\,\Big[\sum_i \exp(-E_i/kT)\Big]^{-1}\sum_i \langle\Psi_i|\hat\mu^2|\Psi_i\rangle\,\exp(-E_i/kT) = (1/3kT)\,\langle\Psi_0|\hat\mu^2|\Psi_0\rangle, \tag{42}$$
where the factor of one-third comes from the summation over states with different M and $\Psi_0$ stands for the lowest-energy spectroscopic state. Conversely, as the energy difference $E_f - E_i$ is usually much greater than kT for pairs of states that differ in N, their approximate contribution to $\chi$ can be written as
$$\chi_2 = 2\,\Big[\sum_i \exp(-E_i/kT)\Big]^{-1}\sum_i\sum_{f>i}\langle\Psi_i|\hat\mu|\Psi_f\rangle\cdot\langle\Psi_f|\hat\mu|\Psi_i\rangle\,(E_f-E_i)^{-1}\exp(-E_i/kT) = \alpha. \tag{43}$$
In other words, the conventional Langevin-Debye formula is recovered but with the expectation value of the square of the dipole moment operator in place of the square of the permanent dipole moment.
Second, consider an experiment in which molecules are deflected by an inhomogeneous electric field. According to the conventional description, the deflection is caused by the interaction between the field and the molecules represented by material points with permanent dipole moments. On the other hand, a rigorous quantum-mechanical treatment of this phenomenon calls for the consideration of the coupling between the overall motion of the center-of-mass and the perturbed spectroscopic states [24]. This coupling arises as a consequence of the fact that the total momenta of molecules are not conserved in the course of such an experiment.
Molecules for which a temperature-dependent dielectric susceptibility is observed in gas phase are commonly called polar. Polar molecules have microwave spectra with transitions corresponding to $\Delta J = \pm 1$ and are deflected by inhomogeneous electric fields. In the conventional approach, these phenomena are attributed to the presence of permanent dipole moments in such molecules. In contrast, the notion of permanent dipole moments (which are zero for spectroscopic states) plays no role at all in the fully quantum-mechanical treatment outlined above. The temperature-dependent component of $\chi$ arises from the existence of low-lying spectroscopic states for which the transition matrix elements do not vanish [Eq.(42)]. Transitions between these states are responsible for the observed microwave spectra. Coupling to these states also explains the deflection by inhomogeneous electric fields.
The conventional description of molecules, which is obviously much more intuitive and straightforward than its quantum-mechanical counterpart, is often adequate. Nevertheless, the manifestations of quantum effects are easily detectable experimentally. For example, species such as HC≡CD, HD, or CH3D, which are clearly nonpolar by the conventional definition, do possess temperature-dependent $\chi$ and observable microwave spectra, and do deflect in inhomogeneous electric fields [11,16]. In fact, if one insists upon the conventional approach, these observations can be consistently accounted for by assuming the presence of small (of the order of 0.01 [D]) permanent dipole moments in these molecules. However, a rigorous quantum-mechanical treatment of such cases is clearly preferable.
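To get a feeling for the magnitudes involved in the Langevin-Debye formula, Eq.(40), the sketch below evaluates its temperature-dependent term in atomic units. The function names are hypothetical, the conversion constants are standard values, and the 0.01 [D] dipole moment is simply the order of magnitude quoted above; any polarizability passed to `langevin_debye` is an input supplied by the user, not a value taken from the text.

```python
HARTREE_IN_KELVIN = 3.1577502480407e5   # E_h / k_B
DEBYE_IN_AU = 1.0 / 2.541746473         # one debye in atomic units (e * a_0)

def orientational_term(mu_au, temperature_k):
    """Temperature-dependent part mu^2 / (3kT) of Eq. (40), in atomic units."""
    kt_hartree = temperature_k / HARTREE_IN_KELVIN
    return mu_au ** 2 / (3.0 * kt_hartree)

def langevin_debye(mu_au, alpha_au, temperature_k):
    """Susceptibility per molecule, Eq. (40): mu^2 / (3kT) + alpha."""
    return orientational_term(mu_au, temperature_k) + alpha_au

# Orientational contribution of a 0.01 D moment at 300 K (atomic units).
small_dipole_term = orientational_term(0.01 * DEBYE_IN_AU, 300.0)
```

Because the orientational term scales as 1/T, measuring the susceptibility at two temperatures separates it from the temperature-independent polarizability, which is how even very small effective dipole moments become detectable.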
6. THE CONCEPT OF MOLECULAR GEOMETRY
When nuclei in molecules are treated classically, the concept of molecular geometry emerges in a natural way. To be more precise, the equilibrium geometry is defined as the set of internal coordinates R for which the ground-state eigenvalue $\varepsilon_0(R)$ of the electronic Hamiltonian attains a local minimum. Different minima in $\varepsilon_0(R)$ correspond to equilibrium geometries of isomeric species with identical compositions. Needless to say, this naive picture is inevitably lost in a fully quantum-mechanical treatment.
Molecular geometries measured with condensed-phase techniques such as X-ray diffraction or NMR cannot be regarded as inherent properties of isolated species. Similarly, as the "determination" of molecular geometries from microwave spectra involves collation of data pertaining to many spectroscopic states of species differing in isotopic compositions, such geometries are merely collections of fitting parameters that cannot be viewed as quantum-mechanical observables.
All the information about positions of nuclei that can be obtained for a given spectroscopic state $\Psi_{NJM\Pi}(r,R)$ is contained in the corresponding nuclear probability density $P_{NJM\Pi}(R)$,
$$P_{NJM\Pi}(R) = \int \Psi^*_{NJM\Pi}(r,R)\,\Psi_{NJM\Pi}(r,R)\,dr. \tag{44}$$
It is obvious that the point and permutational symmetries of $\Psi_{NJM\Pi}(r,R)$ are reflected in $P_{NJM\Pi}(R)$. The presence of these symmetries precludes the retrieval of the "classical-type" molecular structures [18] (geometrical parameters describing quasi-rigid assemblies of atoms that undergo vibrations and rotations) from the wavefunctions of spectroscopic states. This point is nicely illustrated by an example involving the simple adiabatic approximation [Eq.(18)] to $\Psi_{NJM\Pi}(r,R)$. In molecules that are not fluxional, $\chi(R)$ is narrowly peaked around R that corresponds to the equilibrium geometry. It is often mistakenly believed that this fact implies localization of nuclei and thus the validity of the classical concept of molecular structure. However, a closer inspection of Eqs.(18) and (44) reveals that this is certainly not true. First of all, $P_{NJM\Pi}(R)$ derived from such $\chi(R)$ is not peaked around any particular R because of the angular averaging brought about by the integration over the Euler angles $\Omega$. Second, thanks to the presence of the nuclear symmetrizer/antisymmetrizer, $P_{NJM\Pi}(R)$ is fully symmetrical with respect to interchange of coordinates of any two nuclei of the same type. Consider, for example, a molecule with the composition AB2 [4]. Clearly, only one value of the internuclear separation $\langle|R_A-R_B|\rangle_{NJM\Pi}$,
$$\langle|R_A-R_B|\rangle_{NJM\Pi} = \langle\Psi_{NJM\Pi}(r,R)\,|\,|R_A-R_B|\,|\,\Psi_{NJM\Pi}(r,R)\rangle = \int P_{NJM\Pi}(R)\,|R_A-R_B|\,dR, \tag{45}$$
can be extracted from $\Psi_{NJM\Pi}(r,R)$, whether or not the two "bond lengths" $R_{AB}$ in AB2 are identical. This lack of distinct internuclear separations carries over to localized states, as the permutational symmetry of nuclei persists under all nonrelativistic perturbations.
These complications notwithstanding, molecular geometries can be derived solely from wavefunctions of spectroscopic states (or even nonstationary states with zero linear momentum) under favorable conditions. This can be accomplished in principle [21] by constructing the functions $\{I_{NJM\Pi}(R_{AB})\}$ for all pairs of nuclear types A and B, and locating the positions of their maxima. Cartesian norms of these position vectors can be regarded as "bond lengths" $\{R_{AB}\}$ from which the molecular geometry in question can be reconstructed by triangulation, provided a sufficient number of the maxima can be found. Such molecular geometry, which is essentially the entity afforded by electron diffraction experiments, corresponds to the "potential molecular structure" [18]. The fact that distinct "bond lengths" can be extracted after all comes as no surprise in light of the rich topologies exhibited by electron intracule densities [26], which are analogous to $I_{NJM\Pi}(R_{AB})$.
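The two operations invoked in the prescription above, locating maxima of a distance distribution and triangulating a geometry from the resulting "bond lengths", can be sketched as follows. Everything in this block is a hypothetical illustration: the distribution is assumed to be available as samples on a one-dimensional grid of internuclear distances, and the triangulation is shown only for the simplest case of three distances in an AB2 species.

```python
import math

def local_maxima(xs, ys, min_fraction=0.01):
    """Return grid positions where ys has a strict local maximum.

    Maxima lower than min_fraction times the global maximum are ignored,
    so that numerical noise in the tails is not mistaken for a peak.
    """
    threshold = min_fraction * max(ys)
    return [xs[i] for i in range(1, len(ys) - 1)
            if ys[i] > ys[i - 1] and ys[i] > ys[i + 1] and ys[i] >= threshold]

def angle_from_distances(r_ab1, r_ab2, r_bb):
    """B-A-B angle in degrees from two A-B distances and the B-B distance."""
    cos_angle = (r_ab1 ** 2 + r_ab2 ** 2 - r_bb ** 2) / (2.0 * r_ab1 * r_ab2)
    cos_angle = max(-1.0, min(1.0, cos_angle))   # guard against round-off
    return math.degrees(math.acos(cos_angle))

# Made-up A-B distance distribution with two distinct maxima.
grid = [0.01 * i for i in range(301)]
profile = [math.exp(-((x - 1.0) / 0.05) ** 2)
           + 0.5 * math.exp(-((x - 1.6) / 0.05) ** 2) for x in grid]
bond_lengths = local_maxima(grid, profile)
```

In this made-up example the distribution has two maxima, so two distinct A-B "bond lengths" are recovered even though a single expectation value of the kind given in Eq.(45) could never distinguish them. If fewer maxima are found than the triangulation requires, the procedure simply fails.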
Several points concerning the above prescription should be emphasized. First of all, it is an arbitrary construction that is not derivable from the postulates of quantum mechanics. Second, since the presence of a sufficient number of the aforementioned maxima cannot be guaranteed in general, this prescription is by no means universal. Third, since atoms and molecules have infinite extents, similar considerations cannot be employed in a definition of molecular shape. In summary, although isolated molecules possess neither classical structures nor shapes [3], their geometries can be defined under certain conditions.
7. THE CONCEPT OF MOLECULAR STRUCTURE
Within the conventional approach to description of chemical species, molecular structure is usually construed somewhat vaguely as a union (in the set-theoretical sense) of atoms and their connectivities through bonds. A rigorous definition of molecular structure can be readily arrived at by observing that distinct molecular geometries may correspond to the same molecular structure. In other words, from the mathematical point of view, molecular structures are simply equivalence classes of molecular geometries [27-29]. How these equivalence classes are defined is a matter of choice. For example, they may be associated with catchment regions on the potential energy hypersurfaces [30].
A fully quantum-mechanical treatment of nuclei is incompatible with the conventional notion of molecular structure. Nevertheless, it is possible to assign molecular structures to both spectroscopic and localized states, and even to nonstationary states with zero linear momentum. Such an assignment proceeds through the following steps [21]:
1. Construction of the electron density $\rho(Z|R,t)$,
$$\rho(Z|R,t) = \int \Psi^*(r,R,t)\,\Psi(r,R,t)\,\sum_i \delta(r_i-Z)\,dr, \tag{47}$$
where $\Psi(r,R,t)$ is the (time-dependent) wavefunction of the state in question.
2. Topological analysis of $\rho(Z|R,t)$: for a given R and t, all the points in $\rho(Z|R,t)$ that are extremal with respect to the three-dimensional vector Z are located and the molecular graph G(R,t) is constructed [27,28].
3. Identification of all the molecular structures compatible with $\Psi(r,R,t)$, that is construction of a set $\{Q_i\}$ of all distinct equivalence classes of {G(R,t)}. Two molecular graphs belong to the same equivalence class $Q_i$ if and only if they are homeomorphic [27,28]. The set $\{Q_i\}$ is finite, provided that the system under consideration consists of a finite number of particles (nuclei and electrons).
4. Calculation of the probabilities $P(Q_i,t)$,
$$P(Q_i,t) = \int_{R:\,G(R,t)\in Q_i} P(R,t)\,dR, \tag{48}$$
where P(R,t) is the nuclear probability density [Eq.(44)] that corresponds to $\Psi(r,R,t)$.
The quantity $P(Q_i,t)$ constitutes the probability of the system in question being at the time t in a quantum-mechanical state compatible with the structure $Q_i$. Obviously, $Q_i$ with the largest $P(Q_i,t)$ is the dominant molecular structure or simply the molecular structure at that moment of time. Such a definition, aspects of which have been considered before [28,29], is fully quantum-mechanical in nature and does not rely on any particular approximation to $\Psi(r,R,t)$.
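Step 4 of the procedure above lends itself to a simple sampling estimate: if nuclear configurations are drawn from P(R,t), the probability in Eq.(48) is approximated by the fraction of samples whose molecular graph falls into a given equivalence class. The sketch below assumes that both the samples and a classifier are supplied by the user. The classifier stands in for the topological analysis of steps 2 and 3, which is not implemented here, and the one-dimensional toy configurations are purely hypothetical.

```python
from collections import Counter

def structure_probabilities(samples, classify):
    """Estimate P(Q_i) of Eq. (48) from configurations sampled from P(R, t).

    samples  : iterable of nuclear configurations drawn from P(R, t)
    classify : callable mapping a configuration to a hashable structure label;
               a placeholder for the graph construction and equivalence test
    """
    counts = Counter(classify(configuration) for configuration in samples)
    total = sum(counts.values())
    return {label: count / total for label, count in counts.items()}

def dominant_structure(probabilities):
    """Return the structure label with the largest estimated probability."""
    return max(probabilities, key=probabilities.get)

# Toy example: a single coordinate and two "structures" split at x = 0.
toy_samples = [-1.2, -0.8, -1.0, 0.9]
toy_probabilities = structure_probabilities(
    toy_samples, lambda x: "left" if x < 0.0 else "right")
toy_dominant = dominant_structure(toy_probabilities)
```

The estimate inherits the usual statistical error of sampling, and its quality depends entirely on how faithfully the supplied classifier reproduces the equivalence classes $Q_i$.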
8. CONCLUDING REMARKS
The long-standing dispute concerning the concepts of molecular geometry, structure, and shape constitutes a classical example of a discord between the fundamentalist and pragmatic points of view. To a scientist who adopts the purist's stance, these concepts are simply artifacts that have no place in the rigorous quantum-mechanical theory of matter. Isolated molecules in stationary states have no shapes and possess no discernible geometries or structures, at least in the conventional sense [3]. The "classical-type" notion of molecular structure [18] is not derivable from the postulates of quantum mechanics and has to be added if usual descriptions of molecules in terms of atoms and bonds are sought [31]. In other words, such a notion of molecular structure is "a concept involving a convention, an agreed upon definition" [32]. Other concepts that rely on the "body-fixed" rather than "space-fixed" view of molecules fare no better. For example, despite some claims to the contrary [32,33], the notion of potential energy hypersurfaces cannot be consistently extended beyond the Born-Oppenheimer approximation [34,35]. Such objections to the familiar concepts of chemistry create serious challenges for chemical educators [36].
Spectroscopic states are extremely sensitive to the presence of external fields and to intermolecular interactions. In fact, spectroscopic states are so unstable that even minute perturbations effect complete mixing of states with different parities [12,37,38]. As a consequence of this "flea on the elephant" [37] phenomenon, the localization of wavefunctions in the vicinity of the potential energy minima is a rule rather than an exception in the real world. Thus, from the pragmatist's point of view, spectroscopic states are of little relevance to most experimental situations as interactions with environment and/or external perturbations brought by the measurement process effectively mask the subtleties of the conceptually proper quantum-mechanical description of molecules. Hence, approaches such as the Born-Oppenheimer approximation that operate within the realm of the broken-symmetry localized states are perfectly acceptable as long as they afford sufficiently accurate predictions of experimental data. Moreover, unlike the rigorous treatment, such approaches offer the benefits of simple explanations that appeal even to those who are not well-versed in the intricacies of quantum mechanics (i.e. to the majority of chemists). From this point of view, it is quite immaterial whether, for example, the optical activity originates within a molecule or is acquired through its interaction with environment [39], as long as it can be accurately computed from a model that involves the molecule in question alone. In cases where the conventional description breaks down, one can always revert to sophisticated methodologies that go beyond the Born-Oppenheimer approximation [6,9,40].
Although it is impossible to formulate a definition of molecular geometry that is fully quantum-mechanical in nature and at the same time universally applicable to all chemical species, topological analysis of the electron density leads to a rigorous statement of the dominant molecular structure for any state, spectroscopic or localized, stationary or time dependent, with zero linear momentum. In this sense, unlike geometry or shape, structure is an observable property of an isolated molecule.
It is highly unlikely that the conventional view of molecules will relinquish its dominance in the near future. As long as chemists stay aware of the fact that such a view constitutes modeling of reality rather than objective truth, the description of chemical species in terms of intuitively appealing entities remains beneficial to scientific progress. In a sense, the discussion presented here parallels the dispute concerning the wisdom of analyzing electronic wavefunctions in terms of atomic and bond properties [41]. Such analysis inevitably hinges upon the augmentation of the postulates of quantum mechanics with definitions of operators for these properties. What compensates for the arbitrariness of such an augmentation and the introduction of "unphysical" quantities is the resulting simplicity of description. It is this simplicity that makes the crude models of chemistry so appealing to so many scientists.
REFERENCES
1. R. Hoffmann and P. Laszlo, Angew. Chem. Int. Ed. Engl. 30 (1991) 1.
2. C. Trindle, Croat. Chem. Acta 6 (1984) 1231.
3. R.G. Woolley, J. Am. Chem. Soc. 100 (1978) 1073.
4. R.G. Woolley, Isr. J. Chem. 19 (1980) 30.
5. H.C. Longuet-Higgins, Mol. Phys. 6 (1963) 445.
6. L. Lathouwers and P. Van Leuven, Adv. Chem. Phys. 49 (1982) 115.
7. M. Born and R. Oppenheimer, Ann. Phys. 84 (1927) 457.
8. M. Born and K. Huang, Dynamical Theory of Crystal Lattices, Clarendon Press, Oxford, 1954.
9. E. Deumens, Y. Ohrn, L. Lathouwers, and P. Van Leuven, J. Chem. Phys. 84 (1986) 3944.
    L. Lathouwers, P. Van Leuven, and M. Bouten, Chem. Phys. Lett. 52 (1977) 439.
    L. Lathouwers, Phys. Rev. A 18 (1978) 2150.
    E. Deumens, L. Lathouwers, P. Van Leuven, and Y. Ohrn, Int. J. Quant. Chem. Symp. 18 (1984) 339.
    L. Lathouwers and P. Van Leuven, Int. J. Quant. Chem. Symp. 12 (1978) 371.
    L. Lathouwers and P. Van Leuven, Chem. Phys. Lett. 67 (1979) 436.
    E. Deumens, L. Lathouwers, and P. Van Leuven, Chem. Phys. Lett. 112 (1984) 341.
10. S. Aronowitz, Int. J. Quant. Chem. 14 (1978) 253.
11. H.W. Kroto, Molecular Rotation Spectra, Dover Publ., New York, 1992.
12. P. Claverie and G. Jona-Lasinio, Phys. Rev. A 33 (1986) 2245.
13. F. Hund, Z. Phys. 43 (1927) 805.
14. R.A. Harris and L. Stodolsky, Phys. Lett. 78B (1978) 313.
15. B.R. Fischer and P. Mittelstaedt, Phys. Lett. A 147 (1990) 411.
16. R.G. Woolley, Adv. Phys. 25 (1976) 27.
17. P.W. Anderson, Phys. Rev. 75 (1949) 1450.
18. P. Claverie and S. Diner, Isr. J. Chem. 19 (1980) 54.
19. S.F. Mason and G.E. Tranter, Mol. Phys. 53 (1984) 1091.
    D.W. Rein, J. Mol. Evol. 4 (1974) 15.
    S.F. Mason and G.E. Tranter, Chem. Phys. Lett. 94 (1983) 34.
    R.A. Hegstrom, D.W. Rein, and P.G.H. Sandars, J. Chem. Phys. 73 (1980) 2329.
20. S. Yomosa, J. Phys. Soc. Jap. 44 (1978) 602.
21. J. Cioslowski, unpublished results.
22. J. Cioslowski and M. Martinov, J. Chem. Phys. 103 (1995) 4967 and the references cited therein.
23-27. R.A. Harris and L. Stodolsky, J. Chem. Phys. 74 (1981) 2145.
    E. Joos and H.D. Zeh, Z. Phys. B 59 (1985) 223.
    R.A. Harris and R. Silbey, J. Chem. Phys. 78 (1983) 7330.
    R.G. Woolley, Chem. Phys. Lett. 44 (1976) 73.
    J.H. Van Vleck, Phys. Rev. 29 (1927) 727.
    J. Cioslowski and G. Liu, J. Chem. Phys. 105 (1996) 8187.
    R.F.W. Bader, Atoms in Molecules: A Quantum Theory, Clarendon, Oxford, 1990.
28. Y. Tal, R.F.W. Bader, and J. Erkku, Phys. Rev. A 21 (1980) 21.
    R.F.W. Bader, Y. Tal, S.G. Anderson, and T.T. Nguyen-Dang, Isr. J. Chem. 19 (1980) 8.
29. J.L. Villaveces C. and E.E. Daza C., Int. J. Quant. Chem. Symp. 24 (1990) 97.
30. P.G. Mezey, Theor. Chim. Acta 58 (1981) 309.
    P.G. Mezey, Theor. Chim. Acta 62 (1982) 133.
    P.G. Mezey, J. Mol. Struct. (Theochem) 103 (1983) 81.
31. R.G. Woolley, Chem. Phys. Lett. 55 (1978) 443.
32. E.B. Wilson, Int. J. Quant. Chem. Symp. 13 (1979) 5.
33. J. Czub and L. Wolniewicz, Mol. Phys. 36 (1978) 1301.
    G. Hunter, Int. J. Quant. Chem. 9 (1975) 237.
34. R.G. Woolley, Chem. Phys. Lett. 125 (1986) 200.
35. D.M. Bishop and G. Hunter, Mol. Phys. 30 (1975) 1433.
36. S.J. Weininger, J. Chem. Educ. 61 (1984) 939.
    R.G. Woolley, J. Chem. Educ. 62 (1985) 1082.
37. B. Simon, J. Func. Anal. 63 (1985) 123.
38. S. Graffi, V. Grecchi, and G. Jona-Lasinio, J. Phys. A 17 (1984) 2935.
39. R.G. Woolley, Chem. Phys. Lett. 79 (1981) 395.
40. P.M. Kozlowski and L. Adamowicz, Chem. Rev. 93 (1993) 2007 and the references cited therein.
    H.J. Monkhorst, Phys. Rev. A 36 (1987) 1544.
    R.D. Poshusta, Int. J. Quant. Chem. 24 (1983) 65.
    J.F. Capitani, R.F. Nalewajski, and R.G. Parr, J. Chem. Phys. 76 (1982) 568.
    D.M. Bishop and L.M. Cheung, Int. J. Quant. Chem. 15 (1979) 517.
    D.M. Bishop and L.M. Cheung, Phys. Rev. A 16 (1977) 640.
    D.M. Bishop, Phys. Rev. Lett. 37 (1976) 484.
    D.M. Bishop, Mol. Phys. 28 (1974) 1397.
41. J. Cioslowski, Analysis of Electronic Wavefunctions, in Encyclopedia of Computational Chemistry, P.v.R. Schleyer (ed.), Wiley, New York, 1997.
Z.B. Maksić and W.J. Orville-Thomas (Editors)
Pauling's Legacy: Modern Modelling of the Chemical Bond
Theoretical and Computational Chemistry, Vol. 6
© 1999 Elsevier Science B.V. All rights reserved.

Beyond the Born-Oppenheimer Approximation
D. B. Kinghorn and L. Adamowicz*
Department of Chemistry, University of Arizona, Tucson, Arizona 85721, U.S.A.

In this review we discuss how one can approach atomic and molecular quantum-mechanical calculations without assuming the clamped-nuclei approximation. Although such calculations are very rare for chemically interesting systems, the conceptual formulation of the theory in this area and, more importantly, the development of the necessary computational tools have progressed to the point where non-Born-Oppenheimer calculations will become possible for more extended molecular systems. If this happens, it will completely undermine traditional approaches to the problems of molecular spectroscopy, which are based on the concept of the Potential Energy Surface (PES). It may also provide new insight into such important concepts of chemistry as Chemical Bonding and Molecular Structure, which have been the central points in the works of L. Pauling [1].

1. INTRODUCTION

In an attempt to make quantum-mechanical calculations on molecular systems practical and to provide a more intuitive interpretation of the computed results, there has been a long quest in the electronic structure theory of molecules to establish a solid basis for separating the motion of light electrons from the motion of heavier nuclei. It is believed that the original work of Born and Oppenheimer (B-O) [2] initiated the discussion by the analysis of the diatomic case. Further works of Combes and Seiler [3], who managed with the use of singular perturbation theory to resolve the problem of the diverging series which appeared in the B-O expansion, and particularly of Klein and coworkers [4], who extended the formalism to polyatomic systems, have brought the consideration of the topic to the level of a commonly accepted theory.
In this context one should definitely mention the pioneering work of Slater [6], who described a scheme which was subsequently advanced by Born [5] to become a more plausible approach than the one formulated in the original B-O paper [2]. Apart from the further refinements of the theoretical grounds for the B-O approach, which is closely related to the notion of the Potential Energy Surface (PES), there has been a continuing interest in theoretical consideration of molecular systems where the motions of both nuclei and electrons are treated equivalently. Before we turn our attention to this approach, it should be mentioned that there is a significant body of work, recently reviewed by Langsfield and Yarkony [7], where the departure from the B-O approximation is made by starting from the conventional PES approximation and by including electron-nucleus coupling operators in the variational or perturbational calculation to facilitate the representation of their collective behaviour. In this review we will not cover these kinds of approaches, but instead we will concentrate on methods which, in treating the nuclear and electronic motions in molecules, depart entirely from the PES concept. It is particularly interesting how in this type of approach the conventional concepts of molecular structure and chemical bonding will be represented and how different these representations are from the representations of these concepts in the B-O approximation. In particular, the concept of chemical bonding, which at the B-O level is an electronic phenomenon, will now include an additional aspect related to the contribution to the bonding effect from the nuclear dynamics. The view that chemical bonding in molecular systems is related to the collective dynamical behaviour of both electrons and nuclei can only be tested in non-adiabatic calculations. Another motivation for developing the non-adiabatic approach to describe the states of molecules stems from the realization that in order to reach "spectroscopic" accuracy in quantum-mechanical calculations (i.e., an error of less than 1 μhartree) one needs to account for the contribution resulting from the coupling between the motions of electrons and nuclei. Modern experimental techniques, such as gas-phase ion-beam spectroscopy, reach accuracy on the order of 0.001 cm$^{-1}$ [8]. In order for quantum molecular mechanics to continue providing assistance in resolving and assigning experimental spectra, especially in studies of reaction dynamics, work has to continue on the development of more refined theoretical methodology which accounts for non-adiabatic interactions.

*This work was supported by the National Science Foundation with grant CHE-9300497
In this review we will first describe two approaches which we have used to represent atomic and molecular systems without resorting to the B-O approximation. Next, we will describe two numerical applications of the theory, which led to determining interesting non-adiabatic contributions. In the last section we will consider future theoretical work on a general non-adiabatic approach to an N-particle system with any isotropic interaction potential, including the Coulombic interaction, which is presently being developed in our group.
2. EQUIVALENT TREATMENT OF NUCLEI AND ELECTRONS
2.1. Explicit separation of the center-of-mass motion (Method I)

The general problem being considered is to find the discrete energy levels of a system of particles interacting under an isotropic potential. More specifically, we are interested in two cases: Case 1, where the particles are nuclei and electrons interacting with a Coulombic potential; and Case 2, where the system consists of any types of particles (e.g., atoms in a molecule or in a cluster) and the interaction potential can be modeled by a many-body expansion in terms of some functional bases. One example of the latter case is a calculation of the ro-vibrational structure of a molecule, where the potential is given by an analytical fit to the PES determined using ab initio calculations, in which case the energy levels include all electronic, vibrational and rotational contributions. Both of these cases are modeled by the same general form of the Hamiltonian. To model the physical systems, i.e., to write the Hamiltonian, the particles are considered to be non-relativistic, charged point masses interacting under an isotropic potential. The total Hamiltonian then has the familiar form,

$$ H_{tot} = -\sum_{i=1}^{N} \frac{1}{2M_i}\nabla^2_{R_i} + V\left(\|R_i - R_j\|;\ i<j,\ i = 1 \ldots N\right). \tag{1} $$
The particles are numbered from 1 to $N$, with $M_i$ the mass of particle $i$, $R_i = [X_i\ Y_i\ Z_i]'$ a column vector of Cartesian coordinates for particle $i$ in the external, laboratory-fixed frame, $\nabla^2_{R_i}$ the Laplacian in the coordinates of $R_i$, and $\|R_i - R_j\|$ the distance between particles $i$ and $j$. The total Hamiltonian, eqn. (1), is, of course, separable into an operator describing the translational motion of the center-of-mass and an operator describing the internal energy. This separation is realized by a transformation to center-of-mass and internal (relative) coordinates. Let $R$ be the vector of particle coordinates in the laboratory-fixed reference frame,

$$ R = \begin{bmatrix} R_1 \\ R_2 \\ \vdots \\ R_N \end{bmatrix} = \begin{bmatrix} X_1 \\ Y_1 \\ Z_1 \\ \vdots \\ Z_N \end{bmatrix}. \tag{2} $$

Center-of-mass and internal coordinates are given by the transformation $T: R \mapsto [r_0', r']'$,

$$ T = \begin{bmatrix} \frac{M_1}{m_0} & \frac{M_2}{m_0} & \frac{M_3}{m_0} & \cdots & \frac{M_N}{m_0} \\ -1 & 1 & 0 & \cdots & 0 \\ -1 & 0 & 1 & \cdots & 0 \\ \vdots & \vdots & \vdots & \ddots & \vdots \\ -1 & 0 & 0 & \cdots & 1 \end{bmatrix} \otimes I_3, \tag{3} $$
where $m_0 = \sum_i M_i$. $r_0$ is the vector of coordinates for the center-of-mass and $r$ is a length $3n = 3(N-1)$ vector of internal coordinates with respect to a reference frame with origin at particle 1,

$$ r = \begin{bmatrix} r_1 \\ r_2 \\ \vdots \\ r_n \end{bmatrix} = \begin{bmatrix} R_2 - R_1 \\ R_3 - R_1 \\ \vdots \\ R_N - R_1 \end{bmatrix}. \tag{4} $$
Using this coordinate transformation, and the conjugate momentum transformation, the internal Hamiltonian for the problems we are considering (Cases 1 and 2 above) can be written as

$$ H = -\frac{1}{2}\left( \sum_{i=1}^{n} \frac{1}{\mu_i}\nabla_i^2 + \sum_{i \neq j}^{n} \frac{1}{M_1}\nabla_i \cdot \nabla_j \right) + V\left(r_{ij};\ i<j,\ i = 0 \ldots n\right), \tag{5} $$
where the $\mu_i$ are reduced masses, $M_1$ is the mass of particle 1 (the coordinate reference particle), and $\nabla_i$ is the gradient with respect to the $x, y, z$ coordinates of $r_i$. The potential energy is still a function of the distances between particles but is now written using internal distance coordinates, $r_{ij} = \|r_i - r_j\| = \|R_{i+1} - R_{j+1}\|$, with $r_{0j} \equiv r_j = \|r_j\| = \|R_{j+1} - R_1\|$. The kinetic energy term in this Hamiltonian can be written as a quadratic form in the length $3n$ vector gradient operator, $\nabla_r$, the gradient with respect to the length $3n$ vector $r$ of internal coordinates. This gives a compact matrix/vector form of the Hamiltonian,

$$ H = -\nabla_r'\,(M \otimes I_3)\,\nabla_r + V\left(r_{ij};\ i<j,\ i = 0 \ldots n\right). \tag{6} $$
$M$ is an $n \times n$ matrix with $1/2\mu_i$ on the diagonal and $1/2M_1$ for the off-diagonal elements. This is the Hamiltonian we use for variational energy calculations (Cases 1 and 2). We make no further transformations or approximations to this Hamiltonian. More information on the center-of-mass separation and the form of the Hamiltonian, eqn. (6), can be found in the references [9,10,12]. In particular, the total Hamiltonian for a three-particle system with masses $\{M_1, M_2, M_3\}$ and charges $\{Q_1, Q_2, Q_3\}$ interacting under a Coulomb potential is represented (in atomic units) by

$$ H_{tot} = \frac{P_1^2}{2M_1} + \frac{P_2^2}{2M_2} + \frac{P_3^2}{2M_3} + \frac{Q_1 Q_2}{R_{12}} + \frac{Q_1 Q_3}{R_{13}} + \frac{Q_2 Q_3}{R_{23}}, \tag{7} $$
where $R_i$, $P_i$ are the position and momentum vectors for particle $i$ and $R_{ij} = \|R_i - R_j\|$. This Hamiltonian is separable into a Hamiltonian for the kinetic energy of the center-of-mass and a Hamiltonian describing the internal interaction energy. A transformation leading to this separation is given by [9]

$$ \begin{array}{llll} r_0 = \dfrac{M_1 R_1 + M_2 R_2 + M_3 R_3}{m_0}, & m_0 = M_1 + M_2 + M_3, & p_0 = P_1 + P_2 + P_3, & q_0 = Q_1, \\[2mm] r_1 = R_2 - R_1, & m_1 = \dfrac{M_1 M_2}{M_1 + M_2}, & p_1 = P_2 - \dfrac{M_2}{m_0}\,p_0, & q_1 = Q_2, \\[2mm] r_2 = R_3 - R_1, & m_2 = \dfrac{M_1 M_3}{M_1 + M_3}, & p_2 = P_3 - \dfrac{M_3}{m_0}\,p_0, & q_2 = Q_3. \end{array} \tag{8} $$
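Applying this transformation to the total Hamiltonian gives $H_{tot} = H_{cm} + H$, where

$$ H_{cm} = \frac{p_0^2}{2m_0} \tag{9} $$

and

$$ H = \frac{p_1^2}{2m_1} + \frac{p_2^2}{2m_2} + \frac{p_1 \cdot p_2}{M_1} + \frac{q_0 q_1}{r_1} + \frac{q_0 q_2}{r_2} + \frac{q_1 q_2}{r_{12}}. \tag{10} $$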
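The separation of the kinetic energy in eqns. (7)-(10) can be checked numerically. The following is a minimal sketch, assuming classical momentum 3-vectors as stand-ins for the momentum operators; the function names and the random test momenta are illustrative and are not part of the original treatment.

```python
import numpy as np

def internal_momenta(P, M):
    """Apply the momentum part of eqn. (8) to lab-frame momenta P (one row per particle)."""
    m0 = M.sum()
    p0 = P.sum(axis=0)                 # total (center-of-mass) momentum
    p1 = P[1] - (M[1] / m0) * p0       # momentum conjugate to r1 = R2 - R1
    p2 = P[2] - (M[2] / m0) * p0       # momentum conjugate to r2 = R3 - R1
    return p0, p1, p2

def kinetic_lab(P, M):
    """Kinetic part of eqn. (7): sum of P_i^2 / 2M_i."""
    return sum(P[i] @ P[i] / (2.0 * M[i]) for i in range(3))

def kinetic_separated(P, M):
    """Kinetic parts of eqns. (9) and (10), returned as (T_cm, T_internal)."""
    m0 = M.sum()
    m1 = M[0] * M[1] / (M[0] + M[1])
    m2 = M[0] * M[2] / (M[0] + M[2])
    p0, p1, p2 = internal_momenta(P, M)
    t_cm = p0 @ p0 / (2.0 * m0)
    # The last term is the "mass polarization" contribution p1.p2 / M1.
    t_int = p1 @ p1 / (2.0 * m1) + p2 @ p2 / (2.0 * m2) + p1 @ p2 / M[0]
    return t_cm, t_int

rng = np.random.default_rng(0)
M = np.array([1836.152693, 1.0, 1.0])   # proton mass from Table 1, two electrons
P = rng.normal(size=(3, 3))             # arbitrary test momenta
t_cm, t_int = kinetic_separated(P, M)
print(kinetic_lab(P, M), t_cm + t_int)
```

The two printed numbers should agree to rounding error for any choice of masses and momenta, which is the content of $H_{tot} = H_{cm} + H$ for the kinetic terms.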
The Hamiltonian in eqn. (10) can be viewed as representing the internal motions of a three-particle system or as the total energy of a two-particle system with fictitious masses $m_1$, $m_2$ and charges $q_1$, $q_2$ interacting with a charge $q_0$ at the origin, together with their mutual Coulomb interaction and a momentum-dependent "mass polarization potential". Now, for H$^-$, D$^-$, and T$^-$, the example which will be considered in the next section, we assign the nucleus to particle one and the electrons to particles two and three, giving the non-relativistic, non-adiabatic Hamiltonian used in these calculations,

$$ H = -\frac{\nabla_1^2}{2\mu_1} - \frac{\nabla_2^2}{2\mu_2} - \frac{\nabla_1 \cdot \nabla_2}{M_N} - \frac{1}{r_1} - \frac{1}{r_2} + \frac{1}{r_{12}}. \tag{11} $$
Table 1
Mass values and other constants

M_H = 1836.152693    mu_H = 0.9994556794    E_0(H) = -0.4997278397
M_D = 3670.483008    mu_D = 0.9997276305    E_0(D) = -0.4998638152
M_T = 5496.921571    mu_T = 0.9998181131    E_0(T) = -0.4999090565

R_inf = 109737.31534 cm^-1    alpha^-1 = 137.03599944    E_0 = -mu/2
Here $\mu_1 = \mu_2$ is the reduced electron mass, $M_N$ is the nuclear mass, $r_1$ and $r_2$ are the relative coordinates described in the center-of-mass transformation, eqn. (8), and we have written the momentum operator $p$ as $-i\nabla$. For the values of the nuclear masses see Table 1. It is possible to write eqn. (11) in a form that makes scaling and perturbation corrections to the infinite nuclear mass approximation more obvious [13]. Since for the cases being considered $\mu = \mu_1 = \mu_2$, the coordinate scaling $\rho_i = (\mu/m_e)\,r_i$ gives a Hamiltonian,

$$ H = \frac{\mu}{m_e}\left[ \left\{ -\frac{1}{2}\nabla_{\rho_1}^2 - \frac{1}{2}\nabla_{\rho_2}^2 - \frac{1}{\rho_1} - \frac{1}{\rho_2} + \frac{1}{\rho_{12}} \right\} - \frac{\mu}{M_N}\,\nabla_{\rho_1} \cdot \nabla_{\rho_2} \right], \tag{12} $$

that has the same eigenvalues as eqn. (11). The terms in braces make up the infinite nuclear mass Hamiltonian, thus suggesting the following perturbation expansion,

$$ E = \frac{\mu}{m_e}\left( \varepsilon_0 + \varepsilon_1\,\frac{\mu}{M_N} + \varepsilon_2 \left(\frac{\mu}{M_N}\right)^2 + \cdots \right), \tag{13} $$

where $\varepsilon_0 = \langle H \rangle_\infty$ is the energy in the infinite nuclear mass approximation and $\varepsilon_1 = -\langle \nabla_1 \cdot \nabla_2 \rangle_\infty$ is a first-order mass polarization correction.
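The reduced masses and reference energies listed in Table 1 follow directly from the nuclear masses. The sketch below is only an illustration, assuming atomic units with the electron mass $m_e = 1$ and taking $E_0 = -\mu/2$ as the table states; the dictionary and function names are illustrative.

```python
# Nuclear masses from Table 1, in units of the electron mass.
M_NUC = {"H": 1836.152693, "D": 3670.483008, "T": 5496.921571}

def reduced_mass(m_nuc, m_e=1.0):
    """Reduced mass of an electron relative to a nucleus of mass m_nuc."""
    return m_nuc * m_e / (m_nuc + m_e)

def reference_energy(mu):
    """E_0 = -mu/2, the relation given in Table 1."""
    return -mu / 2.0

# Map each isotope to (mu, E_0).
table = {}
for label, mass in M_NUC.items():
    mu = reduced_mass(mass)
    table[label] = (mu, reference_energy(mu))
    print(f"{label}: mu = {mu:.10f}, E0 = {table[label][1]:.10f}")
```

The factor $\mu/m_e$ computed here is the same factor that scales the coordinates and the energy in eqns. (12) and (13).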
2.2. Effective non-adiabatic method (Method II)

The approach described in the previous section was based on strict separation of the center-of-mass (CM) motion from the internal motion through a transformation of the coordinate system. In our recent work we have also taken a different approach [14-22], which is based on an effective, rather than explicit, separation of the CM motion. The separation is not done at the coordinate level but in the Hamiltonian. To demonstrate the central point of this approach, let us consider again the total Hamiltonian of the system, eqn. (1). This Hamiltonian can always be separated into the internal Hamiltonian, $H_{int}$, and the Hamiltonian representing the kinetic energy of the CM motion, $T_{CM} = P_{CM}^2/2M$ (with $M = m_0$ the total mass):

$$ H_{tot} = T_{CM} + H_{int}. \tag{14} $$
Due to this separation, the total wave function can always be represented as a product $\Phi_{CM}\Psi_{int}$. The eigenfunctions of $T_{CM}$ are the plane waves $\Phi_{CM} = \exp(i k_0 \cdot r_0)$ with corresponding eigenvalues $k_0^2/2m_0$. Instead of an explicit separation, let us now consider the following variational functional:

$$ J[\Psi_{tot}] = \frac{\langle \Psi_{tot} | H_{tot} - T_{CM} | \Psi_{tot} \rangle}{\langle \Psi_{tot} | \Psi_{tot} \rangle} = \frac{\langle \Psi_{tot} | H_{int} | \Psi_{tot} \rangle}{\langle \Psi_{tot} | \Psi_{tot} \rangle}. \tag{15} $$
Both operators, i.e., the total Hamiltonian, $H_{tot}$, and the kinetic energy of the CM motion, $T_{CM}$, have simple forms in the Cartesian coordinate system. The full optimization effort can now be directed solely to improving the internal energy of the system because the functional (15) now contains only the internal Hamiltonian. One can expect that after optimization the variational wave function will be a sum of products of the internal ground state and wave functions representing different states of the CM motion:

$$ \Psi_{tot} = \Psi_{int} \sum_i c_i\,\Phi_{CM}^{i}. \tag{16} $$

However, since the internal Hamiltonian only acts on the internal wave function, the minimum of the variational functional, $J[\Psi_{tot}]$, becomes

$$ \min J[\Psi_{tot}] = \min \left[ \frac{\langle \Psi_{int} | H_{tot} - T_{CM} | \Psi_{int} \rangle}{\langle \Psi_{int} | \Psi_{int} \rangle} \right], \tag{17} $$

which, according to the variational principle, satisfies

$$ \min J[\Psi_{tot}] \geq E_{int}. \tag{18} $$
Therefore, by minimizing the functional (15), one obtains directly an upper bound to the internal energy of the system. The kinetic energy of the CM motion is not minimized. The above considerations demonstrate that the internal energy can be separated from the total energy of the system without an explicit transformation to the CM coordinate system. This is an important point, since an inappropriate elimination of the CM can lead to so-called "spurious" states [23], which in turn lead to contradictory results in non-adiabatic calculations. From the general considerations presented in the previous section, one can expect that the many-body non-adiabatic wave function should fulfill the following conditions: (1) all particles involved in the system should be treated equivalently; (2) correlation of the motions of all the particles in the system resulting from Coulombic interactions, as well as from the required conservation of the total linear and angular momenta, should be explicitly incorporated in the wave function; (3) particles can only be distinguishable via the permutational symmetry; (4) the total wave function should possess the internal and translational symmetry properties of the system; (5) for fixed positions of the nuclei, the wave functions should become equivalent to what one obtains within the Born-Oppenheimer approximation; and (6) the wave function should be an eigenfunction of the appropriate total spin and angular momentum operators. The most general expansion which fulfills the above conditions has the form

$$ \Psi_{tot} = \sum_{\mu=1}^{K} c_\mu\,P(1, 2, \ldots, N)\left[\omega_\mu(r_1, r_2, \ldots, r_N)\,\Theta^{N}_{S,M}\right], \tag{19} $$

where $\omega_\mu$ and $\Theta^{N}_{S,M}$ represent the spatial and spin components, respectively. In the above expansion, we schematically indicated that each $\omega_\mu$ should possess appropriate permutational properties, which is accomplished via an appropriate symmetry projection operator $P(1, 2, \ldots, N)$. The total wave function should also be an eigenfunction of the $\hat{S}^2$ and $\hat{S}_z$ spin operators.

Different functional bases have been proposed for non-adiabatic calculations on three-body systems; however, extension to many particles has been difficult. Due to the nature of the Coulombic two-body interactions, the basis functions should explicitly depend on the interparticle distances to effectively describe the spatial correlation. Unfortunately, for most types of explicitly correlated wave functions (i.e., functions explicitly dependent on interparticle distances), the resulting many-electron integrals required to calculate the Hamiltonian matrix are usually difficult to evaluate. An exception is the basis set of explicitly correlated gaussian geminals, which contain products of two gaussian orbitals and a correlation factor of the form $\exp(-\beta r_{ij}^2)$. These functions were introduced by Boys [24] and Singer [25]. The correlation part effectively reduces or enhances the amplitude of the wave function when two particles with like or opposite charges, respectively, approach one another. The application of the explicitly correlated gaussian geminals is not as effective as other types of correlated functions due to a rather poor representation of the cusp. However, such functions form a mathematically complete set [26], and all required integrals have closed forms, as was demonstrated by Lester and Krauss [27].
During the last three decades explicitly correlated gaussian geminals have been successfully applied to different problems, for example in calculations of the correlation energy for some closed-shell atoms and molecules, intermolecular interaction potentials, polarizabilities, Compton profiles, and electron scattering cross sections [28-35]. Explicitly correlated gaussian geminals were also applied to minimize the second-order energy functional in the perturbation calculation of the electronic correlation energy. Pan and King [36] demonstrated that rather short expansions with appropriate minimizations of the nonlinear parameters lead to very accurate results for atoms. The same idea was later extended to molecular systems by Adamowicz and Sadlej [37-43]. We should also mention a series of papers by Monkhorst and co-workers [44-48] where explicitly correlated gaussian geminals were used in conjunction with second-order perturbation theory and the coupled cluster method. In a non-adiabatic calculation, the spatial part of the ground state N-particle wave function, eqn. (19), can be expanded in terms of the following explicitly correlated gaussian functions:

$$ \omega_\mu = \prod_{i=1}^{N} \exp\left(-\alpha_i^\mu\,|R_i - R_i^\mu|^2\right) \exp\left(-\sum_{i=1}^{N}\sum_{j>i} \beta_{ij}^\mu\,|R_i - R_j|^2\right). \tag{20} $$
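Each $\omega_\mu$ depends on orbital exponents, $\alpha_i^\mu$, orbital positions, $R_i^\mu$, and correlation exponents, $\beta_{ij}^\mu$. After some algebraic manipulations the correlation part may be rewritten in the following quadratic form,

$$ \omega_\mu = \exp\left(-\sum_{i=1}^{N} \alpha_i^\mu\,|R_i - R_i^\mu|^2 - R\,B^\mu R^T\right), \tag{21} $$

where the vector $R$ is equal to

$$ R = (R_1, R_2, \ldots, R_N), \tag{22} $$

and the matrix $B^\mu$ is constructed with the use of the correlation exponents ($\beta_{ii}^\mu = 0$):

$$ B^\mu = \begin{bmatrix} \sum_{j=1}^{N}\beta_{1j}^\mu & -\beta_{12}^\mu & \cdots & -\beta_{1N}^\mu \\ -\beta_{12}^\mu & \sum_{j=1}^{N}\beta_{2j}^\mu & \cdots & -\beta_{2N}^\mu \\ \vdots & \vdots & \ddots & \vdots \\ -\beta_{1N}^\mu & -\beta_{2N}^\mu & \cdots & \sum_{j=1}^{N}\beta_{Nj}^\mu \end{bmatrix}. \tag{23} $$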
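The step from the pair sum in eqn. (20) to the quadratic form $R B^\mu R^T$ in eqns. (21)-(23) can be verified numerically. This is a small sketch under the assumption that the particle coordinates are stored as an $N \times 3$ array and the correlation exponents as a symmetric $N \times N$ array with zero diagonal; the helper names are illustrative.

```python
import numpy as np

def correlation_matrix(beta):
    """Build B of eqn. (23): row sums of beta on the diagonal, -beta_ij off the diagonal."""
    B = -beta.copy()
    np.fill_diagonal(B, beta.sum(axis=1))
    return B

def pair_sum(beta, R):
    """Direct evaluation of sum_{i<j} beta_ij |R_i - R_j|^2."""
    n = len(R)
    return sum(beta[i, j] * np.sum((R[i] - R[j]) ** 2)
               for i in range(n) for j in range(i + 1, n))

def quadratic_form(B, R):
    """R B R^T with 3-vector entries, i.e. sum_ij B_ij (R_i . R_j)."""
    return np.trace(R.T @ B @ R)

rng = np.random.default_rng(2)
N = 4
upper = np.triu(rng.uniform(0.1, 2.0, size=(N, N)), k=1)
beta = upper + upper.T                  # symmetric, zero diagonal
R = rng.normal(size=(N, 3))             # arbitrary particle positions
B = correlation_matrix(beta)
print(pair_sum(beta, R), quadratic_form(B, R))
```

Each row of $B^\mu$ sums to zero, so the quadratic form is unchanged when all particles are shifted by the same vector, as expected for a function of interparticle distances only.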
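We will now demonstrate that the variational wave function in the form of eqn. (20) can be formally separated into a product of an internal wave function and a wave function of the CM motion. This is mandatory for any variational non-adiabatic wave function in order to provide the required separability of the internal and external degrees of freedom. To demonstrate this, let us consider an N-particle system with masses $m_1, m_2, \ldots, m_N$. An example of a wave function for this system which separates into a product of the internal and external components is

$$ \Psi_{tot} = \exp\left(-\eta R_{CM}^2\right) \sum_\mu C_\mu \exp\left(-\sum_{i=1}^{N}\sum_{j>i} \gamma_{ij}^\mu\,r_{ij}^2\right). \tag{24} $$

It can be shown that with the use of the matrices

$$ \Gamma^\mu = \begin{bmatrix} \sum_{j=1}^{N}\gamma_{1j}^\mu & -\gamma_{12}^\mu & \cdots & -\gamma_{1N}^\mu \\ -\gamma_{12}^\mu & \sum_{j=1}^{N}\gamma_{2j}^\mu & \cdots & -\gamma_{2N}^\mu \\ \vdots & \vdots & \ddots & \vdots \\ -\gamma_{1N}^\mu & -\gamma_{2N}^\mu & \cdots & \sum_{j=1}^{N}\gamma_{Nj}^\mu \end{bmatrix}, \tag{25} $$

and

$$ M = \frac{\eta}{m_0^2} \begin{bmatrix} m_1^2 & m_1 m_2 & \cdots & m_1 m_N \\ m_1 m_2 & m_2^2 & \cdots & m_2 m_N \\ \vdots & \vdots & \ddots & \vdots \\ m_1 m_N & m_2 m_N & \cdots & m_N^2 \end{bmatrix}, \qquad m_0 = \sum_{i=1}^{N} m_i, \tag{26} $$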
the total wave function, $\Psi_{tot}$, can be represented as

$$ \Psi_{tot} = \sum_\mu C_\mu \exp\left(-(R_1, R_2, \ldots, R_N)\,(M + \Gamma^\mu)\,(R_1, R_2, \ldots, R_N)^T\right), \tag{27} $$

which has the same form as the function of eqn. (20) with $R_i^\mu = 0$ and $a_{ij}^\mu = \delta_{ij}\alpha_i^\mu$, i.e.,

$$ \Psi_{tot} = \sum_\mu C_\mu \exp\left(-(R_1, R_2, \ldots, R_N)\,(A^\mu + B^\mu)\,(R_1, R_2, \ldots, R_N)^T\right). \tag{28} $$
The main idea of the methodology presented above rests in treating the N-particle problem non-adiabatically in the Cartesian space without reducing it to an (N-1)-particle problem by explicit separation of the CM motion. One can ask what advantages this approach has in comparison to the conventional one. One clear advantage is that we avoid selecting an internal coordinate system, a procedure that is not unique and may lead to certain ambiguities. Working in the Cartesian space makes the physical picture more intuitive, and the required multiparticle integrals are much easier to evaluate. Those two features are certainly not present if more complicated coordinate systems, such as polyspherical coordinates [45], are used. Also, we noted earlier that the other factor which should be taken into consideration is the proper "ansatz" for the trial variational wave function incorporating the required permutational symmetry. In our approach the appropriate permutational symmetry is easy to implement through direct exchange of the particles in the orbital factors and the correlation components. This task, however, can be complicated when one works with a transformed coordinate system. The last problem which should be taken into consideration relates to the rotational properties of the wave function. Definitions of the appropriate rotational operators in terms of the Cartesian coordinates are straightforward. However, in a transformed coordinate system, the operators representing the rotation of the system about its CM can be complicated and may lead to significant difficulties in calculating the required matrix elements.

3. GROUND-STATE WAVE FUNCTION
As mentioned above, the centerpiece of our methodology is the use of explicitly correlated gaussian basis functions, which we will now write in a more general form,

$$ \phi_k = \prod_{i<j} r_{ij}^{m_{kij}} \exp\left[-r'(A_k \otimes I_3)\,r\right]. \tag{29} $$

In the above equation the operation $\otimes$, called the Kronecker product, is defined for matrices $A$ (an $m \times n$ matrix) and $B$ (a $p \times q$ matrix) as the partitioned $mp \times nq$ matrix whose $ij$th partition is $a_{ij}$ times $B$:

$$ A \otimes B = \begin{bmatrix} a_{11}B & \cdots & a_{1n}B \\ \vdots & \ddots & \vdots \\ a_{m1}B & \cdots & a_{mn}B \end{bmatrix}. \tag{30} $$
These functions, hereafter referred to simply as "$\phi_k$", will serve as multi-parameter expansion functions for representing the variational non-adiabatic wave functions corresponding to the ground rotational states ($J = 0$). In $\phi_k$ the term $\prod_{i<j} r_{ij}^{m_{kij}}$ is a product of "distance" coordinates, $r_{ij}$, raised to powers $m_{kij}$ (positive, negative, or zero). The exponential component is an explicitly correlated gaussian with $r$ representing a length $3n$ column vector of internal (relative) coordinates ($r'$ denotes the transpose of $r$, i.e., a row vector), and $A_k$ is an $n \times n$ symmetric matrix of exponent parameters (usually positive semi-definite). The Kronecker product of $A_k$ with the $3 \times 3$ identity matrix, $I_3$, ensures rotational invariance. Using the vector form, eqn. (34), it is easy to show that $\phi_k$ is rotationally invariant; that is, invariant to any orthogonal transformation. Let $U$ be any $3 \times 3$ orthogonal matrix (any proper or improper rotation in 3-space); then the action of $U$ on $\phi_k$ is to transform the quadratic forms in the pre-multiplying factors and the exponential factor as (using the exponential factor as an example):

$$ \left((I_n \otimes U)\,r\right)'(A_k \otimes I_3)\,(I_n \otimes U)\,r = r'(I_n \otimes U')(A_k \otimes I_3)(I_n \otimes U)\,r \tag{31} $$

$$ = r'(A_k \otimes U'U)\,r \tag{32} $$

$$ = r'(A_k \otimes I_3)\,r, \tag{33} $$

leaving $\phi_k$ invariant. Hence, any expansion in $\phi_k$ will be isotropic in $\mathbb{R}^3$. The matrix/vector form of $\phi_k$, eqn. (34), allows us to exploit the powerful matrix differential calculus, described by Kinghorn [11], for deriving elegant and easily implementable mathematical forms for the integrals and their derivatives required in variational calculations. Alternatively, $\phi_k$ can be written purely in terms of the vector variable $r$,

$$ \phi_k = \prod_{i<j} \left[r'(J_{ij} \otimes I_3)\,r\right]^{m_{kij}/2} \exp\left[-r'(A_k \otimes I_3)\,r\right], \tag{34} $$
where the $r_{ij}$ term was written as a function of $r$ using the matrix $(J_{ij} \otimes I_3)$, with $J_{ij}$ defined as an $n \times n$ matrix with 1's in the $ii$ and $jj$ positions, $-1$ in the $ij$ and $ji$ positions, and 0's elsewhere [9,11]. It can be further shown that $\phi_k$ can be expressed using only distance coordinates:
i<j
where

$$ A_{ij} = \begin{cases} B_{ii} + \sum_{k \neq i,j} (B_{ik} + B_{kj}), & i = j, \\ -(B_{ij} + B_{ji}), & i \neq j. \end{cases} \tag{36} $$
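The rotational invariance argument of eqns. (31)-(33) and the use of $J_{ij}$ in eqn. (34) can both be illustrated with explicit Kronecker products. The following is a minimal numerical sketch, assuming that $r$ is stored with the three Cartesian components of each $r_i$ contiguous; indices in the code are 0-based and the function names are illustrative.

```python
import numpy as np

def j_matrix(n, i, j):
    """J_ij: 1 at (i,i) and (j,j), -1 at (i,j) and (j,i), 0 elsewhere (0-based i != j)."""
    J = np.zeros((n, n))
    J[i, i] = J[j, j] = 1.0
    J[i, j] = J[j, i] = -1.0
    return J

def quad(A, r):
    """Quadratic form r'(A (x) I3) r for a length-3n vector r."""
    return r @ np.kron(A, np.eye(3)) @ r

rng = np.random.default_rng(1)
n = 3
L = np.tril(rng.normal(size=(n, n)))
A = L @ L.T                                    # symmetric exponent matrix
r = rng.normal(size=3 * n)                     # stacked relative coordinates

U, _ = np.linalg.qr(rng.normal(size=(3, 3)))   # random orthogonal 3x3 matrix
r_rot = np.kron(np.eye(n), U) @ r              # same rotation applied to every r_i

invariance_gap = abs(quad(A, r_rot) - quad(A, r))
ri = r.reshape(n, 3)
dist_gap = abs(quad(j_matrix(n, 0, 2), r) - np.sum((ri[0] - ri[2]) ** 2))
print(invariance_gap, dist_gap)
```

Both printed gaps should be at the level of rounding error: the first reflects $U'U = I_3$, the second that $r'(J_{ij} \otimes I_3)r$ is the squared distance between the two relative coordinate vectors.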
The desired permutational symmetry can be accounted for in the $\phi_k$ using a projection method. Consider a system of $N$ particles invariant under the action of a group, $G$, represented by a set, $\{P_\alpha : \alpha \in G\}$, of $N \times N$ permutation matrices. A model potential represented as an expansion in $\phi_k$ is a function of the $n = N - 1$ component vectors of $r$, the relative coordinates. The permutation $P_\alpha$ acting on the $N$ particle coordinates induces a transformation on the center-of-mass and relative coordinates given by

$$ T P_\alpha T^{-1} = I_3 \oplus \tau_\alpha, \tag{37} $$

where $T$ is the transformation matrix given in eqn. (3). The right-hand side of this expression is the direct sum of the identity acting on the center-of-mass coordinates, $r_0$, and $\tau_\alpha$, which is an $n \times n$ "permutation" matrix acting on the component vectors of the relative coordinate vector $r$. The action of the permutation represented by $P_\alpha$ on $\phi_k$ is then

$$ P_\alpha \phi_k = \prod_{i<j} \left[r'(\tau_\alpha' J_{ij}\,\tau_\alpha \otimes I_3)\,r\right]^{m_{kij}/2} \exp\left[-r'(\tau_\alpha' A_k \tau_\alpha \otimes I_3)\,r\right]. \tag{38} $$

The action of the totally symmetric representation of the group $G$ on $\phi_k$ is thus induced by the projector $\sum_{\alpha \in G} \tau_\alpha$. This method of symmetry projection on correlated gaussians is discussed in more detail in the references [9,10,12].
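The induced transformation of eqn. (37) can be made concrete for three particles. The sketch below is illustrative only: it works with the $N \times N$ part of $T$ (the factor $\otimes I_3$ is dropped, since it acts identically on each Cartesian component) and assumes equal masses for the permuted particles so that the center of mass is left unchanged; the function names are hypothetical.

```python
import numpy as np

def cm_transform(masses):
    """N x N part of the transformation T of eqn. (3), without the Kronecker factor I3."""
    n_part = len(masses)
    T = np.zeros((n_part, n_part))
    T[0] = masses / masses.sum()        # center-of-mass row
    for k in range(1, n_part):
        T[k, 0] = -1.0                  # r_k = R_{k+1} - R_1
        T[k, k] = 1.0
    return T

def induced_transformation(P, masses):
    """T P T^{-1}: the action of a particle permutation on (r0, r1, ..., rn)."""
    T = cm_transform(masses)
    return T @ P @ np.linalg.inv(T)

masses = np.ones(3)                                          # three identical particles
P12 = np.array([[0.0, 1, 0], [1, 0, 0], [0, 0, 1]])          # exchange particles 1 and 2
P23 = np.array([[1.0, 0, 0], [0, 0, 1], [0, 1, 0]])          # exchange particles 2 and 3

full12 = induced_transformation(P12, masses)
tau12 = full12[1:, 1:]
tau23 = induced_transformation(P23, masses)[1:, 1:]
print(tau12)
print(tau23)
```

Exchanging particles 2 and 3 simply swaps $r_1$ and $r_2$, whereas exchanging the reference particle 1 with particle 2 gives $r_1 \to -r_1$, $r_2 \to r_2 - r_1$, which is why "permutation" appears in quotation marks in the text.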
4. VARIATIONAL CALCULATIONS

The wave functions are optimized for the ground state energy by minimizing the Rayleigh quotient

$$ E(a; c) = \min_{\{a,\,c\}} \frac{c' H(a)\,c}{c' S(a)\,c}. \tag{39} $$
$H(a)$ and $S(a)$ are the Hamiltonian and overlap matrices, which are functions of the nonlinear parameters contained in the basis set exponent matrices $L_k$. We write $a$ for the collection of these nonlinear parameters; $c$ is the vector of linear coefficients in the basis expansion of $\Psi$. Thus, if we let $N$ be the number of basis functions, the energy is, in general, a function of $Nn(n+1)/2 + N$ variables, which in our particular case is $4N$ variables. Obviously a reduction by $N$ variables could be achieved by solving the eigenproblem associated with eqn. (39) to obtain the linear coefficients and then iteratively optimizing the nonlinear parameters. However, we found that a much more thorough optimization could be achieved by letting the optimizer simultaneously vary both the linear and nonlinear parameters. The optimization software employed in our calculations was the package TN by Stephen Nash [63], available from netlib [64]. TN is a truncated Newton method utilizing a user-supplied gradient. The analytic gradient of the energy functional was derived using matrix differential calculus [11,12]. We will now briefly describe certain features of the optimization procedure. For simplicity, in this discussion we will use correlated gaussians without pre-exponential $r_{ij}$ factors. A general form of an n-particle correlated gaussian of this kind is the negative exponential of a positive definite quadratic form in $3n$ variables:

$$ \phi_k = \exp\left[-r'(L_k L_k' \otimes I_3)\,r\right]. \tag{40} $$
Here $r$ is a $3n \times 1$ vector of Cartesian coordinates for the $n$ particles, $L_k$ is an $n \times n$ lower triangular matrix of rank $n$, and $I_3$ is the $3 \times 3$ identity matrix; $k$ ranges from 1 to $N$, where $N$ is the number of basis functions. The Kronecker product with $I_3$ is used to ensure rotational invariance of the basis functions. Also, integrals involving the functions $\phi_k$ are well defined only if the exponent matrix is positive definite and symmetric; this is assured by using the Cholesky factorization $L_k L_k'$. The following simplifications will help keep the notation more compact:

$$ L_k L_k' = A_k, \tag{41} $$

$$ A_k + A_l = A_{kl}, \tag{42} $$

$$ \bar{A} = A \otimes I_3. \tag{43} $$

With the above notational simplifications, the most compact form for $\phi_k$ is

$$ \phi_k = \exp\left[-r' \bar{A}_k\,r\right]. \tag{44} $$
The function $\phi_k$ can be thought of as both a scalar-valued function of the vector $r$ and a scalar-valued function of the matrix $L_k$. The Hamiltonian and overlap matrices constructed from a basis of the $\phi_k$'s are then matrix functions of the matrices $L_k$, and the energy functional is a differentiable function of the independent variational parameters contained in the column vector
$$a = \left[(\mathrm{vech}\, L_1)', \ldots, (\mathrm{vech}\, L_N)'\right]', \qquad (45)$$
where the "vech" ("vector half") operator is defined as follows. Let $A$ be a square $n \times n$ matrix. Then $\mathrm{vech}\, A$ is the $n(n+1)/2 \times 1$ vector obtained by stacking the lower triangular elements of $A$. For example, if $n = 3$,
$$\mathrm{vech}\, A = (a_{11},\ a_{21},\ a_{31},\ a_{22},\ a_{32},\ a_{33})'. \qquad (46)$$
Similarly, the "vec" operator transforms a matrix into a vector by stacking the columns of the matrix one underneath the other:
$$\mathrm{vec}\, A = \begin{pmatrix} a_1 \\ a_2 \\ \vdots \\ a_n \end{pmatrix}, \qquad (47)$$
where $a_j$ is the $j$-th column of $A$.
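The two stacking operators of eqns (46) and (47) can be illustrated with a minimal sketch. The function names `vec` and `vech` and the list-of-rows matrix representation are choices made here for illustration only; they are not taken from the authors' code.

```python
def vec(A):
    """Stack the columns of A (given as a list of rows) one underneath the other."""
    n_rows, n_cols = len(A), len(A[0])
    return [A[i][j] for j in range(n_cols) for i in range(n_rows)]


def vech(A):
    """Stack the lower-triangular part of a square matrix A column by column."""
    n = len(A)
    return [A[i][j] for j in range(n) for i in range(j, n)]


# Entries are labelled so that 21 stands for a_21, and so on.
A = [[11, 12, 13],
     [21, 22, 23],
     [31, 32, 33]]
print(vech(A))  # [11, 21, 31, 22, 32, 33], the ordering of eqn (46)
print(vec(A))   # [11, 21, 31, 12, 22, 32, 13, 23, 33]
```

For an $n \times n$ matrix, `vech` returns $n(n+1)/2$ entries, which is the number of independent nonlinear parameters carried by each $L_k$.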
There are three types of integrals which need to be calculated to determine the value of the Rayleigh quotient, eqn.(39): overlap, kinetic energy and potential energy integrals. Here we will briefly review the integral evaluation. More details about the derivation of the formulas for the integrals and integral derivatives can be found in Ref.[65] as well as in Refs.[14-18,20-22]. The overlap integral over normalized $\phi_k$ and $\phi_l$ functions has the following simple form:
$$S_{kl} = \frac{\langle\phi_k|\phi_l\rangle}{\left(\langle\phi_k|\phi_k\rangle \langle\phi_l|\phi_l\rangle\right)^{1/2}} = 2^{3n/2} \left(\frac{\|L_k\|\,\|L_l\|}{|A_{kl}|}\right)^{3/2}, \qquad (48)$$
where $\|L_k\|$ denotes the absolute value of the determinant of $L_k$.
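As a quick numerical illustration of eqn (48), the following sketch evaluates the normalized overlap with NumPy. The function name `overlap` and the sample matrices are assumptions introduced here; only the formula itself comes from the text.

```python
import numpy as np


def overlap(Lk, Ll):
    """Normalized overlap S_kl of eqn (48) for n x n lower-triangular Lk, Ll."""
    n = Lk.shape[0]
    Akl = Lk @ Lk.T + Ll @ Ll.T                      # A_k + A_l, eqns (41)-(42)
    dets = abs(np.linalg.det(Lk)) * abs(np.linalg.det(Ll))
    return 2.0 ** (1.5 * n) * (dets / np.linalg.det(Akl)) ** 1.5


# Arbitrary two-particle example (n = 2); the numbers carry no physical meaning.
Lk = np.array([[1.0, 0.0], [0.3, 0.8]])
Ll = np.array([[0.7, 0.0], [-0.2, 1.1]])
print(overlap(Lk, Lk))   # 1 up to rounding: each function is normalized
print(overlap(Lk, Ll))   # a value between 0 and 1
```

For a single particle with exponents $a$ and $b$ the same function reduces to the familiar $(2\sqrt{ab}/(a+b))^{3/2}$.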
The kinetic energy operator is
$$T = -\nabla_r' (M \otimes I_3) \nabla_r, \qquad (49)$$
where $\nabla_r$ is the gradient operator with respect to $r$, the vector of all particle coordinates, and $M$ is a constant matrix containing the mass coefficients: $1/2\mu_i$ on the diagonal for the kinetic terms, and $1/2m$ off the diagonal for the mass polarization terms. The integral involving $\phi_k$ and $\phi_l$ has the following form:
$$T_{kl} = 6\, S_{kl}\, \mathrm{tr}\left[M A_k A_{kl}^{-1} A_l\right]. \qquad (50)$$
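A minimal sketch of eqn (50) follows, checked against the textbook one-particle result $\langle T \rangle = 3a/2\mu$ for a normalized gaussian $\exp(-a r^2)$. The helper names `overlap` and `kinetic` and the test values are illustrative assumptions.

```python
import numpy as np


def overlap(Lk, Ll):
    """Normalized overlap S_kl, eqn (48)."""
    n = Lk.shape[0]
    Akl = Lk @ Lk.T + Ll @ Ll.T
    dets = abs(np.linalg.det(Lk)) * abs(np.linalg.det(Ll))
    return 2.0 ** (1.5 * n) * (dets / np.linalg.det(Akl)) ** 1.5


def kinetic(Lk, Ll, M):
    """Kinetic energy matrix element T_kl = 6 S_kl tr[M A_k A_kl^{-1} A_l], eqn (50)."""
    Ak, Al = Lk @ Lk.T, Ll @ Ll.T
    return 6.0 * overlap(Lk, Ll) * np.trace(M @ Ak @ np.linalg.inv(Ak + Al) @ Al)


# One particle (n = 1): phi = exp(-a r^2), with M = [1/(2 mu)].
a, mu = 0.7, 1.0
L = np.array([[np.sqrt(a)]])
M = np.array([[1.0 / (2.0 * mu)]])
print(kinetic(L, L, M))   # close to 3a/(2 mu) = 1.05
```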
Evaluation of the potential energy integral begins with a reasonably straightforward single-term integral; the potential energy operator is then written in matrix form using the coordinate vectors and two newly derived basis matrices. With the operator in matrix form, a strikingly similar form for the matrix elements of this operator is then displayed. The integral
$$\langle\phi_k|\, 1/r_{ij}\, |\phi_l\rangle \qquad (51)$$
can be evaluated with the help of the identity
$$\frac{1}{r_{ij}} = \frac{2}{\sqrt{\pi}} \int_0^\infty e^{-u^2 r_{ij}^2}\, du. \qquad (52)$$
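The square of the interparticle distance can be written as
$$r_{ij}^2 = r_i \cdot r_i + r_j \cdot r_j - r_i \cdot r_j - r_j \cdot r_i, \quad \text{with} \quad r_{ii}^2 \equiv r_i^2 = r_i \cdot r_i, \qquad (53)$$
where $r_i$ is the coordinate vector of the $i$-th particle. If $J_{ij}$ is defined as the matrix
$$J_{ij} = \begin{cases} E_{ii}, & i = j \\ E_{ii} + E_{jj} - E_{ij} - E_{ji}, & i \neq j, \end{cases} \qquad (54)$$
where $E_{ij}$ is the $n \times n$ matrix with 1 in its $ij$-th position and 0's elsewhere, $r_{ij}^2$ can also be written as
$$r_{ij}^2 = r' (J_{ij} \otimes I_3)\, r. \qquad (55)$$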
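Using (55) and (52), the integral (51) can be evaluated as follows:
$$\begin{aligned}
\langle\phi_k|\, 1/r_{ij}\, |\phi_l\rangle
&= \frac{2}{\sqrt{\pi}} \int_0^\infty \langle\phi_k|\, e^{-u^2 r'(J_{ij} \otimes I_3) r}\, |\phi_l\rangle\, du \\
&= \frac{2}{\sqrt{\pi}} \int_0^\infty \int \exp\left[-r' \left((A_{kl} + u^2 J_{ij}) \otimes I_3\right) r\right] dr\, du \\
&= \frac{2}{\sqrt{\pi}} \int_0^\infty \pi^{3n/2}\, |A_{kl} + u^2 J_{ij}|^{-3/2}\, du \\
&= \frac{2}{\sqrt{\pi}}\, \pi^{3n/2}\, |A_{kl}|^{-3/2} \int_0^\infty |I_n + u^2 J_{ij} A_{kl}^{-1}|^{-3/2}\, du.
\end{aligned} \qquad (56)$$
The determinant in this last integral is
$$|I_n + u^2 J_{ij} A_{kl}^{-1}| = 1 + u^2\, \mathrm{tr}\left[J_{ij} A_{kl}^{-1}\right]. \qquad (57)$$
With this result we can finish evaluating the integral:
$$\langle\phi_k|\, 1/r_{ij}\, |\phi_l\rangle = \frac{2}{\sqrt{\pi}}\, \langle\phi_k|\phi_l\rangle \left(\mathrm{tr}\left[J_{ij} A_{kl}^{-1}\right]\right)^{-1/2}. \qquad (58)$$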
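The two steps that close this derivation, the determinant identity (57) and the remaining one-dimensional integral over $u$, can be checked numerically. The sketch below is only an illustration under stated assumptions: the matrices are random, the helper names `J` and `u_integral` are hypothetical, and the quadrature uses the substitution $u = \tan\theta$ chosen here for convenience.

```python
import numpy as np


def J(i, j, n):
    """Matrix J_ij of eqn (54); i and j are 0-based particle indices."""
    e = np.eye(n)
    d = e[i] if i == j else e[i] - e[j]
    return np.outer(d, d)          # E_ii, or E_ii + E_jj - E_ij - E_ji


rng = np.random.default_rng(0)
n = 3
Lk = np.tril(rng.uniform(0.5, 1.5, size=(n, n)))
Ll = np.tril(rng.uniform(0.5, 1.5, size=(n, n)))
Akl_inv = np.linalg.inv(Lk @ Lk.T + Ll @ Ll.T)

u = 0.8
for i in range(n):
    for j in range(i, n):
        t = np.trace(J(i, j, n) @ Akl_inv)
        lhs = np.linalg.det(np.eye(n) + u**2 * J(i, j, n) @ Akl_inv)
        assert np.isclose(lhs, 1.0 + u**2 * t)       # eqn (57): J_ij has rank one


def u_integral(t, m=20000):
    """Midpoint rule for the integral of (1 + t u^2)^(-3/2) over u in [0, inf)."""
    theta = (np.arange(m) + 0.5) * (np.pi / 2) / m   # u = tan(theta)
    f = np.cos(theta) / (np.cos(theta) ** 2 + t * np.sin(theta) ** 2) ** 1.5
    return f.sum() * (np.pi / 2) / m


t = 2.5
print(u_integral(t), t ** -0.5)   # the two numbers agree closely, giving eqn (58)
```

The identity (57) holds because $J_{ij}$ is an outer product of a single vector with itself, so the determinant of the rank-one update collapses to a trace.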
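The potential energy operator involving coulombic interaction, after separation of the center of mass, is:
$$V = \sum_{i=1}^{n} \frac{q_0 q_i}{r_i} + \sum_{j > i} \frac{q_i q_j}{r_{ij}}, \qquad (59)$$
where $q_0$ is the charge at the origin, typically a heavy nucleus, and the $q_i$ are the charges of the remaining particles. This operator can be written in the matrix form
$$V = \mathrm{tr}\left[Q\, (r_{ij}^2)^{[-1/2]}\right] = (\mathrm{vec}\, Q')' \left(\mathcal{R}_n \mathcal{F}_{n3}\, \mathrm{vec}\, rr'\right)^{[-1/2]}, \qquad (60)$$
where $Q$ is an $n \times n$ upper triangular matrix of the charge products
$$Q = \begin{pmatrix}
q_0 q_1 & q_1 q_2 & \cdots & q_1 q_n \\
0 & q_0 q_2 & \cdots & q_2 q_n \\
\vdots & & \ddots & \vdots \\
0 & \cdots & 0 & q_0 q_n
\end{pmatrix} \qquad (61)$$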
and $(r_{ij}^2)^{[-1/2]}$ is the Hadamard, i.e. term-by-term, reciprocal square root of the matrix $(r_{ij}^2)$ whose $ij$-th element is $r_{ij}^2$ (recall that the diagonal terms are defined as in equation (53)). The basis matrix $\mathcal{F}_{n3}$ is defined by its action on the vector $\mathrm{vec}\, rr'$:
$$\mathcal{F}_{n3}\, \mathrm{vec}\, rr' = \mathrm{vec}\, (r_i \cdot r_j). \qquad (62)$$
That is, $\mathcal{F}_{n3}$ is the $n^2 \times (3n)^2$ matrix that transforms the vec of the $3n \times 3n$ matrix $rr'$ into the vec of the $n \times n$ matrix whose $ij$-th element is $r_i \cdot r_j$, where $r_i$ is the coordinate vector for particle $i$. The basis matrix $\mathcal{R}_n$ is defined by
$$\mathcal{R}_n\, \mathrm{vec}\, (r_i \cdot r_j) = \mathrm{vec}\, (r_{ij}^2). \qquad (63)$$
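Both $\mathcal{F}_{n3}$ and $\mathcal{R}_n$ are basis matrices for linear structures and are useful in their own right. In general, $\mathcal{R}_n$ is the $n^2 \times n^2$ matrix that acts on the vec of an arbitrary $n \times n$ matrix $(a_{ij})$ as follows:
$$\mathcal{R}_n\, \mathrm{vec}\, (a_{ij}) = \mathrm{vec} \left( \begin{cases} a_{ii}, & i = j \\ a_{ii} + a_{jj} - a_{ij} - a_{ji}, & i \neq j \end{cases} \right). \qquad (64)$$
Note the similarity to the effect of the matrix $J_{ij}$ in equation (54). In fact, the rows of $\mathcal{R}_n$ are precisely $(\mathrm{vec}\, J_{ij})'$. This relation gives a possible construction for $\mathcal{R}_n$. However, $\mathcal{R}_n$ can also be defined in terms of the duplication matrix $\mathcal{D}_n$ and the matrix $\mathcal{N}_n$ from the section on linear structures:
$$\mathcal{R}_n = 2\mathcal{N}_n\, (I_n \otimes \mathbf{1}_n)\, (2\mathcal{N}_n - \mathcal{D}_n \mathcal{D}_n') - \mathcal{D}_n \mathcal{D}_n'. \qquad (65)$$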
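Here $\mathbf{1}_n$ is an $n \times n$ matrix of 1's. With all of these results in place, the potential energy integral can be written in matrix form as
$$\langle\phi_k|V|\phi_l\rangle = \langle\phi_k|\, (\mathrm{vec}\, Q')' \left(\mathcal{R}_n \mathcal{F}_{n3}\, \mathrm{vec}\, rr'\right)^{[-1/2]} |\phi_l\rangle = \frac{2}{\sqrt{\pi}}\, \langle\phi_k|\phi_l\rangle\, (\mathrm{vec}\, Q')' \left(\mathcal{R}_n\, \mathrm{vec}\, A_{kl}^{-1}\right)^{[-1/2]}. \qquad (66)$$
Thus, using normalized basis functions, the potential energy matrix elements, $V_{kl}$, are:
$$V_{kl} = \frac{2}{\sqrt{\pi}}\, S_{kl}\, (\mathrm{vec}\, Q')' \left(\mathcal{R}_n\, \mathrm{vec}\, A_{kl}^{-1}\right)^{[-1/2]}. \qquad (67)$$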
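A compact way to evaluate eqn (67) without forming $\mathcal{R}_n$ explicitly is to use the fact that the $ij$-th element of $\mathcal{R}_n\, \mathrm{vec}\, A_{kl}^{-1}$ is $\mathrm{tr}[J_{ij} A_{kl}^{-1}]$. The sketch below does this; the function names and the sample charges are assumptions made for illustration, and the one-particle check uses the standard result $\langle 1/r \rangle = 2\sqrt{2a/\pi}$ for a normalized gaussian $\exp(-a r^2)$.

```python
import numpy as np


def overlap(Lk, Ll):
    """Normalized overlap S_kl, eqn (48)."""
    n = Lk.shape[0]
    Akl = Lk @ Lk.T + Ll @ Ll.T
    dets = abs(np.linalg.det(Lk)) * abs(np.linalg.det(Ll))
    return 2.0 ** (1.5 * n) * (dets / np.linalg.det(Akl)) ** 1.5


def potential(Lk, Ll, Q):
    """V_kl of eqn (67); Q is the upper-triangular charge-product matrix (61)."""
    X = np.linalg.inv(Lk @ Lk.T + Ll @ Ll.T)         # A_kl^{-1}
    d = np.diag(X)
    tr_J = d[:, None] + d[None, :] - 2.0 * X         # tr[J_ij A_kl^{-1}] for i != j
    np.fill_diagonal(tr_J, d)                        # tr[J_ii A_kl^{-1}] = (A_kl^{-1})_ii
    return 2.0 / np.sqrt(np.pi) * overlap(Lk, Ll) * np.sum(Q * tr_J ** -0.5)


# One particle of charge -1 bound to a unit positive charge at the origin.
a = 0.5
L1 = np.array([[np.sqrt(a)]])
print(potential(L1, L1, np.array([[-1.0]])))         # close to -2 sqrt(2a/pi)

# Two electrons and a unit positive charge (the H- charge pattern).
Q = np.array([[-1.0, 1.0],
              [0.0, -1.0]])
Lk = np.array([[1.0, 0.0], [0.3, 0.8]])
Ll = np.array([[0.7, 0.0], [-0.2, 1.1]])
print(potential(Lk, Ll, Q))
```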
An approximation to an exact wave function is written as a finite expansion of correlated gaussians, the secular equation is constructed, and a selected energy eigenvalue is minimized with respect to the linear expansion coefficients and the independent nonlinear parameters contained in the matrices $L_k$ of the individual basis functions. Solution of the resulting eigensystem automatically determines the optimal linear expansion coefficients as the eigenvectors for a given set of the nonlinear parameters, but optimization of the nonlinear parameters is a very difficult and computationally demanding task. The first order necessary conditions for an energy minimum require the energy gradient (equivalently, differential or derivative) with respect to all of the matrices $L_k$ to vanish. This section begins the construction of this gradient in an exact analytic form. The secular equation
$$(H - \epsilon S)\, c = 0 \qquad (68)$$
defines $\epsilon$ as an implicit function of the $N \times N$ matrices $H$ and $S$. $H$ and $S$ are themselves functions of the $Nn(n+1)/2 \times 1$ vector $a$ of nonlinear exponential parameters contained in the matrices $L_k$, equation (45). Taking the differential of the secular equation gives
$$d\left[(H - \epsilon S)\, c\right] = d(H - \epsilon S)\, c + (H - \epsilon S)\, dc = (dH - (d\epsilon) S - \epsilon\, dS)\, c + (H - \epsilon S)\, dc = 0. \qquad (69)$$
Multiplying from the left by $c'$, rearranging, and noting that $c'(H - \epsilon S) = 0$ results in
$$d\epsilon = \frac{c' (dH - \epsilon\, dS)\, c}{c' S c}. \qquad (70)$$
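This is the differential with respect to the matrices $H$ and $S$. For the rest of this derivation the scalar term $c'Sc$ will be dropped from the calculations for convenience.
$$\begin{aligned}
d\epsilon &= \mathrm{tr}\left[cc' (dH - \epsilon\, dS)\right] \\
&= (\mathrm{vec}\, cc')'\, \mathrm{vec}\, (dH - \epsilon\, dS) \\
&= (\mathrm{vec}\, cc')'\, \mathcal{D}_N\, \mathrm{vech}\, (dH - \epsilon\, dS) \\
&= (\mathcal{D}_N'\, \mathrm{vec}\, cc')'\, (d\, \mathrm{vech}\, H - \epsilon\, d\, \mathrm{vech}\, S) \\
&= \left(\mathrm{vech}\left[2cc' - \mathrm{diag}\, cc'\right]\right)' \left(\frac{\partial\, \mathrm{vech}\, H}{\partial a'} - \epsilon\, \frac{\partial\, \mathrm{vech}\, S}{\partial a'}\right) da.
\end{aligned} \qquad (71)$$
Bringing back the scalar term $c'Sc$, the energy gradient with respect to $a$ is
$$\nabla_a \epsilon = \frac{1}{c'Sc} \left(\frac{\partial\, \mathrm{vech}\, H}{\partial a'} - \epsilon\, \frac{\partial\, \mathrm{vech}\, S}{\partial a'}\right)' \left(\mathrm{vech}\left[2cc' - \mathrm{diag}\, cc'\right]\right). \qquad (72)$$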
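The eigenvalue differential (70) and its vech form in (71) can be checked against a finite difference on a toy problem. In the sketch below, $H$ and $S$ depend linearly on a single parameter $a$; this model, the random matrices, and all function names are assumptions made purely for illustration and are unrelated to the actual gaussian matrix elements.

```python
import numpy as np


def lowest_eigenpair(H, S):
    """Lowest eigenvalue and eigenvector of H c = eps S c via Cholesky reduction."""
    G_inv = np.linalg.inv(np.linalg.cholesky(S))     # S = G G'
    w, Y = np.linalg.eigh(G_inv @ H @ G_inv.T)
    return w[0], G_inv.T @ Y[:, 0]


def vech(A):
    """Stack the lower triangle of A column by column."""
    n = A.shape[0]
    return np.concatenate([A[j:, j] for j in range(n)])


rng = np.random.default_rng(1)
N = 4
sym = lambda X: 0.5 * (X + X.T)
H0, H1, S1 = (sym(rng.normal(size=(N, N))) for _ in range(3))
B = rng.normal(size=(N, N))
S0 = B @ B.T + N * np.eye(N)                         # safely positive definite


def matrices(a):
    """Toy model: H and S depend linearly on one nonlinear parameter a."""
    return H0 + a * H1, S0 + a * S1


a = 0.1
H, S = matrices(a)
eps, c = lowest_eigenpair(H, S)
dX = H1 - eps * S1                                   # dH/da - eps dS/da

d_eps = c @ dX @ c / (c @ S @ c)                     # eqn (70)
cc = np.outer(c, c)
d_eps_vech = vech(2 * cc - np.diag(np.diag(cc))) @ vech(dX) / (c @ S @ c)  # eqns (71)-(72)

h = 1e-6
d_eps_fd = (lowest_eigenpair(*matrices(a + h))[0]
            - lowest_eigenpair(*matrices(a - h))[0]) / (2 * h)
print(d_eps, d_eps_vech, d_eps_fd)                   # three nearly equal numbers
```

The factor $2cc' - \mathrm{diag}\, cc'$ appears because each off-diagonal element of a symmetric matrix is counted twice in the trace but only once in its vech.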
The matrix $(\partial\, \mathrm{vech}\, H/\partial a' - \epsilon\, \partial\, \mathrm{vech}\, S/\partial a')$ in the gradient above has dimension $N(N+1)/2 \times Nn(n+1)/2$. It is also sparse: of its $N^2 n(n+1)(N+1)/4$ total elements, $N^2 n(n+1)/2$ are non-zero and $N^2 n(N-1)(n+1)/4$ are zero. The non-zero terms in the matrices $\partial\, \mathrm{vech}\, H/\partial a'$ and $\partial\, \mathrm{vech}\, S/\partial a'$ are contained in the $1 \times n(n+1)/2$ vectors $\partial H_{kl}/\partial (\mathrm{vech}\, L_k)'$ and $\partial S_{kl}/\partial (\mathrm{vech}\, L_k)'$, or equivalently $\partial H_{kl}/\partial (\mathrm{vech}\, L_l)'$ and $\partial S_{kl}/\partial (\mathrm{vech}\, L_l)'$. In the next section of this review we present these derivatives.

The power of the matrix differential calculus is immediately apparent when one actually computes an analytic gradient for a matrix function. The ease with which results are obtained and the concise, compact form of the results seem almost miraculous at times. When the derivatives presented here were first formulated, the results were so surprising that numerical confirmation was performed immediately. All of the following matrix derivatives have been confirmed by finite differences, term by term, on random matrices. The following derivations should not be difficult to follow if the preliminary matrix results presented at the beginning of this review are within easy reach. The derivatives are all with respect to the independent parameters contained in the matrix $L_k$, i.e. $\mathrm{vech}\, L_k$.

The derivative of the normalized overlap matrix element, $S_{kl}$, equation (48), presents no new challenges and it is a simple exercise to show
$$\frac{\partial S_{kl}}{\partial (\mathrm{vech}\, L_k)'} = \frac{3}{2}\, S_{kl} \left(\mathrm{vech}\left[(L_k^{-1})' - 2 A_{kl}^{-1} L_k\right]\right)'. \qquad (73)$$
To find the derivative of the kinetic energy matrix element, first note that $A_k A_{kl}^{-1} A_l$ may be written as
$$A_k A_{kl}^{-1} A_l = \left(A_k^{-1} + A_l^{-1}\right)^{-1} \qquad (74)$$
and
$$dS_{kl} = \frac{3}{2}\, S_{kl}\, \mathrm{tr}\left[\left(L_k^{-1} - 2 L_k' A_{kl}^{-1}\right) dL_k\right]. \qquad (75)$$
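Using this result, after some transformations the derivative of the kinetic energy integral becomes:
$$\frac{\partial T_{kl}}{\partial (\mathrm{vech}\, L_k)'} = \left(\mathrm{vech}\left[\frac{3}{2}\, T_{kl} \left((L_k^{-1})' - 2 A_{kl}^{-1} L_k\right) + 12\, S_{kl} \left(A_{kl}^{-1} A_l M A_l A_{kl}^{-1} L_k\right)\right]\right)'. \qquad (76)$$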
The formula for the potential energy uses a Hadamard power; because of this, the identity
$$(\mathrm{vec}\, A \odot \mathrm{vec}\, B)'\, \mathrm{vec}\, C = (\mathrm{vec}\, A)'\, (\mathrm{vec}\, B \odot \mathrm{vec}\, C) \qquad (77)$$
is needed, where the Hadamard product of $A$ and $B$, each of order $m \times n$, is defined as the $m \times n$ matrix
$$A \odot B = (a_{ij} b_{ij}) \qquad (78)$$
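(the Hadamard product is simply the term-by-term product of two matrices). Taking the differential of
$$V_{kl} = \frac{2}{\sqrt{\pi}}\, S_{kl}\, (\mathrm{vec}\, Q')' \left(\mathcal{R}_n\, \mathrm{vec}\, A_{kl}^{-1}\right)^{[-1/2]}, \qquad (79)$$
after some transformations the derivative of the potential energy can be written as:
$$\frac{\partial V_{kl}}{\partial (\mathrm{vech}\, L_k)'} = \frac{V_{kl}}{S_{kl}}\, \frac{\partial S_{kl}}{\partial (\mathrm{vech}\, L_k)'} + \frac{2}{\sqrt{\pi}}\, S_{kl} \left(\mathrm{vec}\, Q' \odot \left(\mathcal{R}_n\, \mathrm{vec}\, A_{kl}^{-1}\right)^{[-3/2]}\right)' \mathcal{R}_n \left(A_{kl}^{-1} L_k \otimes A_{kl}^{-1}\right) \mathcal{L}_n'. \qquad (80)$$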
This completes the derivation of the derivatives needed for the gradient of the energy functional. The compact matrix forms of these results can be manipulated using matrix algebra and are readily implemented using optimized computer subroutine libraries.

5. SAMPLE APPLICATIONS

5.1. Explicit separation of the center-of-mass motion in variational calculations of electron affinities of H$^-$, D$^-$ and T$^-$

First we will demonstrate highly accurate (errors less than $10^{-3}$ cm$^{-1}$) non-adiabatic calculations for H$^-$, D$^-$, and T$^-$ accomplished with Method I. These calculations reflect the capability of the explicitly correlated gaussian basis to describe subtle non-adiabatic effects in small atomic systems.

Experimental values for the electron affinity of Hydrogen have been steadily improving. In the mid 70's a value of 6081 ± … cm$^{-1}$ was reported by Feldmann[49] and independently by McCulloh and Walker[50]. Greater accuracy was then achieved by Chupka, Dehmer and Jivery[51], giving 6083(+11,-3) cm$^{-1}$. In 1979 Scherk[52] reported a value of 6085.5 ± 3.3 cm$^{-1}$. Then in 1991 Lykke, Murray and Lineberger[53] achieved a 20-fold improvement in accuracy, obtaining 6082.99 ± 0.15 cm$^{-1}$ and 6086.2 ± 0.6 cm$^{-1}$ for the electron affinity of Hydrogen and Deuterium, respectively. Recent conversation with Professor Lineberger suggested that improved error limits, especially for Deuterium, may be possible. In anticipation of these improved experimental results, the present work gives highly accurate theoretical predictions for the electron affinity of Hydrogen, Deuterium and Tritium.

There have been several highly accurate ground state energy calculations for the negative Hydrogen ion[54-59]. The most accurate of these calculations[56-58] appear to be converged to the point where the uncertainties in the calculated energies are less than $10^{-13}$ hartree ($10^{-8}$ cm$^{-1}$).
This uncertainty is several orders of magnitude less than the uncertainty in even the most accurately known fundamental physical constants. For example, the Rydberg constant is uncertain in the 10th digit[60] and the atomic mass of Hydrogen is uncertain in the 11th digit[61] (the electron mass is uncertain in the 8th digit[61]). As remarkable as these calculations are, they suffer from one major shortcoming: the computed wave functions are for a fictitious system with an infinitely heavy nuclear mass. It is common to add corrections to these computed energies. The corrections consist of scaling using a reduced-mass-corrected Rydberg constant, the reduced Rydberg, and adding a first order perturbation correction for what is usually called mass polarization[62]. These corrections to the infinite nuclear mass approximation are very good; however, they do not account for the entire energy shift due to nuclear motion. For highly accurate energy calculations on the systems H$^-$, D$^-$, and T$^-$ (and probably for Helium), it is necessary to use a Hamiltonian that explicitly includes the nuclear motion, i.e., the non-adiabatic
Hamiltonian. We demonstrate this in the present work by providing highly accurate variational upper bounds for the non-relativistic ground state energy of H$^-$, D$^-$ and T$^-$. These bounds are computed directly using the finite mass non-adiabatic Hamiltonian. Our H$^-$ energy bound is nearly 13 nano-hartree lower than can be achieved using the infinite nuclear mass approximation plus Rydberg scaling and first order mass polarization corrections, and we know from the essentially exact H$^-$ result of Drake[56] that the exact energy lowering should be closer to 18 nano-hartree. It appears that this 18 nano-hartree can only be recovered rigorously using the non-adiabatic Hamiltonian. This is an energy correction approximately equal to that of the Lamb shift (17 nano-hartree[56]) and thus represents a significant deficiency in the infinite nuclear mass approximation. We use our non-relativistic energy bounds together with relativistic and other small corrections computed by Drake[56] to obtain what should be reliable theoretical values for the electron affinity of Hydrogen, Deuterium and Tritium. We hope that this work will spark renewed interest in high accuracy electron affinity determination among both experimentalists and theoreticians.

For the H$^-$ system and its isotopomers, after separating the CM motion from the Schrödinger equation, the problem is reduced to a two pseudo-particle problem. In the basis functions defined in eqn.(40), $r = (r_1', r_2')'$ is the $6 \times 1$ vector of relative coordinates defined above. The ground state spatial wave function (symmetric with respect to exchange of electrons) is then given as the symmetry projected linear combination of the $\phi_k$,
$$\Psi(r) = \sum_k c_k \left(\exp\left[-r' (L_k L_k' \otimes I_3)\, r\right] + \exp\left[-r' (\tau' L_k L_k' \tau \otimes I_3)\, r\right]\right), \qquad (81)$$
where $\tau$ is the permutation matrix $\begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}$. The energy shift due to finite nuclear mass is
$$\Delta E_{NM} = \langle H \rangle_M - \langle H \rangle_\infty. \qquad (82)$$
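Using eqn.(13), $\Delta E_{NM}$ can be given to very good approximation, for two electron systems, by
$$\Delta E_{NM} \approx \Delta \tilde{E}_{NM} = \frac{\mu}{m_e} \left[\langle H \rangle_\infty - \frac{\mu}{M_N} \langle \nabla_1 \cdot \nabla_2 \rangle_\infty\right] - \langle H \rangle_\infty = \left[\frac{\mu}{m_e} - 1\right] \langle H \rangle_\infty - \frac{\mu}{m_e} \frac{\mu}{M_N} \langle \nabla_1 \cdot \nabla_2 \rangle_\infty, \qquad (83)$$
with $\mu = m_e M_N / (m_e + M_N)$.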
The first term in square brackets in eqn.(83) is the "reduced Rydberg scaling", which could be interpreted as the energy change that would result from changing units from hartree ($= 2R_\infty$) to reduced Rydberg units $2R_M = 2(\mu/m_e) R_\infty$. The second term is a perturbation, the "mass polarization energy", again in reduced Rydberg units. In Table 2, values for $\Delta E_{NM}$, $\Delta \tilde{E}_{NM}$ and $\Delta = \Delta \tilde{E}_{NM} - \Delta E_{NM}$ are given using our infinite nuclear mass wave function and non-adiabatic wave functions for the Hydrogen, Deuterium and Tritium anions. The first order approximation to the nuclear mass energy shift for H$^-$ leaves 17 nano-hartree unaccounted for, which is unacceptable for high accuracy calculations. Drake[56] has computed a total mass polarization correction given as
$$\Delta E_{mp} = .03287978125\, (\mu/M) - .059779493\, (\mu/M)^2. \qquad (84)$$
The coefficient of the first term in eqn.(84) is $\langle \nabla_1 \cdot \nabla_2 \rangle_\infty$. The coefficient of the second term is derived by subtracting the leading term in eqn.(84) from the total energy shift due to mass polarization. This total mass polarization energy shift is computed by subtracting the infinite nuclear mass energy from the non-adiabatic energy of H$^-$ (derived from the Hamiltonian in eqn.(12)). Thus, using the terms in eqn.(84) for the second and third terms in the perturbation expansion, eqn.(13) will, of course, give the exact result for H$^-$ and should give an excellent approximation for D$^-$ and T$^-$. However, we feel that since the non-adiabatic energy for H$^-$ is used to obtain this expansion, the non-adiabatic energy for the other isotopes may as well be computed without approximation.

This section is concluded with Table 3, presenting the variational energy upper bounds computed in this work. Table 3 includes our estimation of the exact energy upper bounds. Our infinite nuclear mass and H$^-$ energy values lie 4.9 nano-hartree above the essentially exact values reported by Drake[56], and we believe that our non-adiabatic calculations for D$^-$ and T$^-$ contain this same deficiency. Therefore, we estimate the exact energy upper bounds by correcting our computed values for this deficiency.

Electron affinities for Hydrogen, Deuterium and Tritium are presented in Table 4. In the calculation of the electron affinities the energies for the neutral atoms are those given in Table 1. The calculations include highly accurate small corrections computed by Drake[56]. These include relativistic, relativistic recoil, Lamb shift and finite nuclear size corrections, giving a total correction of 0.307505 cm$^{-1}$, labeled $\Delta E_{corr}$ in Table 4. We use this correction with Rydberg scaling for Deuterium and Tritium. The electron affinity is then given by
$$EA = E_{atom} - E_{anion} - \Delta E_{corr}, \qquad (85)$$
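The arithmetic behind eqn (85) can be reproduced directly from the Table 4 entries. The sketch below uses only the tabulated energy differences and corrections (in cm$^{-1}$); the function and variable names are illustrative. Because the inputs are rounded to four decimals, the last printed digit can differ from the tabulated EA by one unit.

```python
def electron_affinity(e_atom_minus_anion, de_corr):
    """EA = E_atom - E_anion - dE_corr, eqn (85); all values in cm^-1."""
    return e_atom_minus_anion - de_corr


# (E_atom - E_anion, dE_corr) in cm^-1, copied from Table 4.
table4 = {
    "H": (6083.4058, 0.307505),
    "D": (6087.0201, 0.307589),
    "T": (6088.2233, 0.307616),
}
for isotope, (diff, corr) in table4.items():
    print(isotope, round(electron_affinity(diff, corr), 4))
```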
where $E_{atom}$ is the energy of the neutral atom, $E_{anion}$ is the non-adiabatic energy of the negative ion and $\Delta E_{corr}$ is the correction described above. Included in Table 4 are the computational results for Hydrogen of Pekeris[54] and of Drake[56], and the experimental results for Hydrogen and Deuterium of Lykke, Murray and Lineberger[53]. Also included are our estimates of the exact electron affinity computed from the energy estimates given in Table 3. The result from Pekeris in Table 4 includes only the first order mass polarization correction and Drake's relativistic corrections $\Delta E_{corr}$.

5.2. Calculation on HD$^+$ with effective non-adiabatic method

The second example concerns non-adiabatic calculations on the HD$^+$ cation performed with Method II. The hydrogen molecular ion, HD$^+$, has played an important role in the development of molecular quantum mechanics. Many different methods have been tested on HD$^+$. Since the interelectronic interaction is not present, very accurate numerical results can be obtained. One of the most accurate non-adiabatic calculations was performed by Bishop and co-workers[43]. Bishop in his calculations used the following basis functions
Table 2
Energy shift due to finite nuclear mass

Quantity                         hartree              cm$^{-1}$
$\Delta E_{NM}$ (H$^-$)          .0003051354          66.96948
$\Delta \tilde{E}_{NM}$ (H$^-$)  .0003051531          66.96947
$\Delta$ (H$^-$)                 1.77 x 10$^{-8}$     .00389
$\Delta E_{NM}$ (D$^-$)          .0001526918          33.51198
$\Delta \tilde{E}_{NM}$ (D$^-$)  .0001526963          33.51296
$\Delta$ (D$^-$)                 4.49 x 10$^{-9}$     .00099
$\Delta E_{NM}$ (T$^-$)          .0001019683          22.37946
$\Delta \tilde{E}_{NM}$ (T$^-$)  .0001019703          22.37990
$\Delta$ (T$^-$)                 2.01 x 10$^{-9}$     .00044
Table 3
Energy upper bounds

          This work          "Exact" estimate
H$^-$     -.5274458762       -.5274458811
D$^-$     -.5275983198       -.5275983247
T$^-$     -.5276490433       -.5276490482
expressed in terms of elliptical coordinates, with $R$ being the nuclear separation:
$$\phi_{ijk}(\xi, \eta, R) = \exp(-\alpha \xi)\, \cosh(\beta \eta)\, \xi^i \eta^j R^{-3/2} \exp\left[\tfrac{1}{2}(-x^2)\right] H_k(x), \qquad (86)$$
where $x = \gamma (R - \delta)$; $H_k(x)$ are Hermite polynomials; $\alpha$, $\beta$, $\gamma$ and $\delta$ are adjustable parameters chosen to minimize the lowest energy level; and $i$, $j$ and $k$ are integers. The energy obtained using this expansion was -0.597 897 967 a.u.

In the effective approach, without explicit separation of the center-of-mass motion, the non-adiabatic wave function for the HD$^+$ ion is expressed in terms of explicitly correlated
Table 4
Electron affinity of Hydrogen, Deuterium and Tritium (in cm$^{-1}$). The term $\Delta E_{corr}$ contains relativistic, relativistic recoil, Lamb shift and finite nuclear size corrections

                                        Hydrogen            Deuterium           Tritium
$E_H - E_{H^-}$                         6083.4058           6087.0201           6088.2233
$\Delta E_{corr}$                       .307505             .307589             .307616
EA                                      6083.0983           6086.7126           6087.9157
"Exact" estimate                        6083.0994           6086.7137           6087.9168
Drake[56]                               6083.099414
Pekeris[54] (using $\Delta E_{corr}$)   6083.0909
Lykke[53] (experiment)                  6082.99 ± 0.15      6086.2 ± 0.6
gaussian functions:
$$\Psi_{tot} = \sum_{k=1}^{M} c_k\, \omega_k(r_D, r_H, r_e)\, \Theta(D)\, \Theta(H)\, \Theta(e), \qquad (87)$$
where $\Theta(D)$, $\Theta(H)$ and $\Theta(e)$ represent the spin functions for the deuteron, proton and electron, respectively, and the spatial basis functions are defined as
$$\omega_k = \exp\left[-\alpha_D^k r_D^2 - \alpha_H^k r_H^2 - \alpha_e^k r_e^2 - \beta_{HD}^k r_{HD}^2 - \beta_{He}^k r_{He}^2 - \beta_{De}^k r_{De}^2\right], \qquad (88)$$
where $r_D$, $r_H$ and $r_e$ are the position vectors of the deuteron, the proton and the electron, respectively, and $r_{HD}$, $r_{He}$ and $r_{De}$ denote the respective interparticle distances.
Table 5
Ground state internal energies (in a.u.) computed with basis sets of M functions for the HD$^+$ molecule.

M        E
10       -0.562 111
18       -0.586 854
36       -0.593 885
50       -0.595 369
60       -0.595 665
100      -0.596 435
200      -0.596 806

The best literature value = -0.597 897 967 a.u.
Table 6
Expectation values of the square of the interparticle separations (in a.u.) computed with basis sets of different lengths (M) for the HD$^+$ molecule.

M        $\langle r_{HD}^2 \rangle$    $\langle r_{He}^2 \rangle$    $\langle r_{De}^2 \rangle$
60       4.4794                        3.6312                        3.6318
100      4.4119                        3.5982                        3.5906
200      4.3145                        3.5549                        3.5517
The values of the total ground state energy of HD$^+$ calculated with gaussian basis sets of different lengths are presented in Table 5. One can see a consistent convergence trend with the increasing number of functions. Comparing our best result of -0.596806 a.u., obtained with 200 gaussian functions, with the result of Bishop and Cheung, -0.597898 a.u., indicates that more gaussian functions will be needed to reach this result with our method. Following evaluation of the HD$^+$ ground state wave function we calculated the expectation values of the squares of the interparticle distances. The HD$^+$ system should possess a slight asymmetry in the values of $\langle r_{He}^2 \rangle$ and $\langle r_{De}^2 \rangle$, leading to a permanent dipole moment. There is a simple reason for the asymmetry of the electronic distribution in HD$^+$. For deuterium, the reduced mass and binding energy are slightly larger, and the corresponding wave function smaller, than for hydrogen. This leads to the contribution of the ionic structure H$^+$D$^-$ being slightly larger than that of H$^-$D$^+$, and in effect to a net moment H$^{+\delta}$D$^{-\delta}$. The values of $\langle r_{HD}^2 \rangle$, $\langle r_{He}^2 \rangle$ and $\langle r_{De}^2 \rangle$ for the wave functions of different lengths, presented in Table 6, indicate that the electron shift towards the deuterium nucleus is correctly predicted in this approach.

6. GENERAL N-BODY NON-ADIABATIC WAVE FUNCTION
As we have suggested recently[68], the technique involving separation of the CM motion and representation of the wave function in terms of explicitly correlated gaussians is not limited to non-adiabatic systems with coulombic interactions, but can also be extended to study assemblies of particles interacting with different types of two- and multi-body potentials. In particular, with this approach one can calculate the vibration-rotation structure of molecules and clusters. In all these cases the wave function will be expanded as symmetry projected linear combinations of the explicitly correlated $\phi_k$ of eqn.(29) multiplied by an angular term, $Y_{LM}^k$:
$$\Psi_{LM\Gamma} = \mathcal{P}_\Gamma \sum_k c_k\, Y_{LM}^k\, \phi_k. \qquad (89)$$
Here $\mathcal{P}_\Gamma$ is an appropriate permutational symmetry projection operator for the desired state, $\Gamma$, and $Y_{LM}^k$ is a product of coupled solid harmonics labeled by the total angular momentum quantum numbers $L$ and $M$. Permutational symmetry is handled using projection methods in the same manner as described for the potential expansion in the previous section. Again, the reader is referred to the references for details[9,10,12]. $Y_{LM}^k$ is a vector coupled product of solid harmonics[69] given by the Clebsch-Gordan expansion,
$$Y_{LM}^k = \sum_{\substack{\{l_j, m_j\} \\ m_1 + \cdots + m_n = M}} \langle LM; k\, |\, l_1 m_1 \ldots l_n m_n \rangle \prod_{j=1}^{n} Y_{l_j m_j}. \qquad (90)$$
The solid harmonics are given by
$$Y_{lm}(r_j) = \left[\frac{2l+1}{4\pi}\, (l+m)!\, (l-m)!\right]^{1/2} \sum_p \frac{(-x_j - i y_j)^{p+m}\, (x_j - i y_j)^p\, z_j^{\,l-m-2p}}{2^{2p+m}\, (p+m)!\, p!\, (l-m-2p)!}. \qquad (91)$$
The $Y_{lm}(r_j)$ are single particle angular momentum eigenfunctions in relative coordinates which transform in the same way as spherical harmonics, i.e., have the same eigenvalues. Since the $\phi_k$ are angular momentum eigenfunctions with zero total angular momentum, the product with $Y_{LM}^k$ can be used, in principle, to obtain any desired angular momentum eigenstate. Note the $k$ dependence of $Y_{LM}^k$; this is included since there are many ways to couple the individual angular momenta $l_j$ to achieve the desired total angular momentum $L$, and it may be necessary to include several sets of the $l_j$ in order to obtain a realistic description of the wave function. Varga and Suzuki[66] have recently proposed representing the angular dependence of the wave function using a single solid harmonic whose argument contains additional variational parameters, $u = (u_1, u_2, \ldots, u_n)$:
$$\Psi_{LM\Gamma} = \mathcal{P}_\Gamma\, Y_{LM}(v) \sum_k c_k \phi_k, \quad \text{with} \quad v = \sum_{i=1}^{n} u_i r_i. \qquad (92)$$
There appear to be several advantages in doing this and we are investigating the possibility of using this approach in our full N-body implementation. The strict separation of the angular and "radial" variables in eqns.(89) and (92) allows separate consideration of the vibrational states with different total angular momentum quantum number, $L$. The magnitude of the Coriolis coupling for the particular L-state will determine whether the most general form, eqn.(89), or the more simplified form, eqn.(92), of the total wave function should be used.

There have been several highly accurate non-adiabatic variational calculations on atomic and exotic few particle systems using simple correlated gaussians[10,12,18,75,66,67]. By simple we mean they only contain the exponential part of the $\phi_k$ (no $r_{ij}$ pre-multipliers). However, attempts at non-adiabatic molecular calculations have been plagued by problems with linear dependence in the basis during energy optimizations. This problem occurs in calculations on atomic systems also, but to a much lesser extent. We believe that we understand this phenomenon and that the basis including pre-multiplying powers of $r_{ij}$ will eliminate or at least drastically reduce the linear dependence problems. Our reasoning is as follows. In systems with more than one heavy particle there will be large particle density away from the origin in relative coordinates. That is, the wave function will have peaks shifted away from the origin. There are three ways to account for this behavior in the wave function using correlated gaussians: 1) use correlated gaussians with shifted centers, i.e., $\exp[-(r - s)' A (r - s)]$; 2) use near linearly dependent combinations of simple correlated gaussians with large matched ± linear coefficients; or 3) use pre-multiplying powers of $r_{ij}$. The first option is unacceptable since it results in a wave function which no longer represents a pure angular momentum state.
The second option is what we believe causes the linear dependence and numerical instability which we are trying to avoid. The third option is what we are proposing. The linear dependence that we have observed in our calculations using the simple correlated gaussians looks, in some sense, like an attempt by the optimization to include in the wave function derivatives of the basis functions with respect to the non-linear parameters. The near linearly dependent terms resemble numerical derivatives. Removal of these near linearly dependent terms has an adverse effect on the wave functions, as manifested by poor energy results, but leaving them in leads to numerical instabilities which hinder optimization or cause complete collapse of the eigen-solutions. Now, derivatives of simple gaussians with respect to non-linear parameters, elements of the matrices $A_k$, bring down pre-multiplying (even) powers of $r_{ij}$. Thus, explicitly including pre-multiplying powers of $r_{ij}$ in the basis functions should add the needed flexibility to the basis in a numerically stable way. Also, we expect the rate of convergence to be improved by these pre-multiplying $r_{ij}$ terms in the same way that they affect convergence in the Hylleraas basis. The $\phi_k$ are similar to the Hylleraas basis functions with the Slater-type exponentials replaced by fully correlated gaussian type exponentials.
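The argument that matched pairs of nearly identical gaussians act as numerical derivatives can be made concrete for a single radial variable. The sketch below is a one-dimensional illustration under assumed values ($a = 1$, $h = 10^{-4}$); the function names are hypothetical. It shows that a pair with coefficients $\pm 1/h$ reproduces $-r^2 \exp(-a r^2)$, i.e. a gaussian with an even power pre-multiplier, while the two members of the pair are almost linearly dependent.

```python
import math


def gaussian(r, a):
    """Simple (unnormalized) gaussian exp(-a r^2)."""
    return math.exp(-a * r * r)


def matched_pair(r, a, h):
    """Two nearly identical gaussians with large matched +/- coefficients 1/h."""
    return (gaussian(r, a + h) - gaussian(r, a)) / h


def normalized_overlap(a, b):
    """Overlap of two normalized one-particle gaussians, eqn (48) with n = 1."""
    return (2.0 * math.sqrt(a * b) / (a + b)) ** 1.5


a, h = 1.0, 1e-4
for r in (0.5, 1.0, 1.5):
    premultiplied = -r * r * gaussian(r, a)          # r^2 pre-multiplier form
    print(r, matched_pair(r, a, h), premultiplied)   # the two columns nearly agree

print(1.0 - normalized_overlap(a, a + h))            # tiny: near linear dependence
```

The smaller $h$ is made, the better the pair mimics the pre-multiplied function, but the closer the overlap matrix of the pair comes to singular, which is the instability described in the text.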
The above conclusion is supported by our recent prototype calculations for the vibrational structure of the H$_2$ molecule, which were done with the above-described methodology[68]. However, instead of including all four particles (two electrons and two protons) in the calculations, we only considered the nuclei interacting through the potential obtained by an analytical gaussian fit to the B-O energy values of Kolos and Wolniewicz[70]. The most interesting feature of the results was a dominating contribution from higher powers of the H-H internuclear distance to all vibrational levels. For example, when only one $\phi_k$ was used in the expansion, the lowest ground state vibrational energy was obtained with $\phi = r^{17} e^{-4.3496 r^2}$. This result suggests that, in constructing non-adiabatic wave functions for molecules, one needs to use $\phi_k$'s with high powers of $r_{ij}$ to describe the relative motion of nuclei. The superposition of these types of functions with functions describing the relative motion of the correlated electrons in the attractive field of nuclei, which we obtain in B-O calculations using correlated gaussians[71-75], will be used as the starting wave function in the fully non-adiabatic molecular calculations.

7. SUMMARY

In this review we focused on a practical approach allowing quantum-mechanical description of the dynamics of the collective motion of nuclei and electrons in molecular systems. This approach allows investigation of the chemical bonding as a dynamical phenomenon which includes electrostatic, induction, charge-transfer and dispersion effects due to both electrons and nuclei. In this the procedure differs from the conventional approach, which treats the chemical bonding as an electronic phenomenon. It is clear that more work is needed to develop a computational procedure which can be used to study the dynamics of chemical bonding in the way described in this review.
Application of explicitly correlated gaussians is one of the most promising approaches which can lead to practical applications.

REFERENCES
1. L. Pauling, in Foundations of Physics vol.22, no.6 (1992) p.829-38.
2. M. Born and J.R. Oppenheimer, Ann. Phys. 84 (1927) 457.
3. J.M. Combes and R. Seiler, Quantum Dynamics of Molecules, ed. R.G. Woolley, Plenum Press, New York, 1980, p.435.
4. M. Klein, A. Martinez, R. Seiler and X. Wang, Commun.Math.Phys. 143 (1992) 607.
5. M. Born and K. Huang, Dynamical Theory of Crystal Lattices, Oxford University Press, 1955, Appendix 8.
6. J.C. Slater, Proc.Nat.Acad.Sci. 13 (1927) 423.
7. B.H. Lengsfield and D.R. Yarkony, Adv.Chem.Phys. 82 (1992) 1.
8. A. Carrington and R.A. Kennedy, Gas Phase Ion Chemistry, ed. M.T. Bowers, Academic Press, New York, vol.3, p.393.
9. R.D. Poshusta, Int.J.Quantum Chem. 24 (1983) 65.
10. D.B. Kinghorn and R.D. Poshusta, Phys.Rev. A 47 (1993) 3671.
11. D.B. Kinghorn, Int.J.Quantum Chem. 57 (1996) 141.
12. D.B. Kinghorn and R.D. Poshusta, Int.J.Quantum Chem. (1995), in press.
13. G.W.F. Drake, Atomic, Molecular, and Optical Physics Handbook, American Institute of Physics, Woodbury, New York, 1996.
14. P.M. Kozlowski and L. Adamowicz, J.Chem.Phys. 95 (1991) 6681.
15. P.M. Kozlowski and L. Adamowicz, J.Comput.Chem. 13 (1992) 602.
16. P.M. Kozlowski and L. Adamowicz, J.Chem.Phys. 96 (1992) 9013.
17. P.M. Kozlowski and L. Adamowicz, J.Chem.Phys. 97 (1992) 5063.
18. P.M. Kozlowski and L. Adamowicz, Phys.Rev. A 48 (1993) 1903.
19. P.M. Kozlowski and L. Adamowicz, Chem.Rev. 93 (1993) 2007.
20. P.M. Kozlowski and L. Adamowicz, Int.J.Quantum Chem. 55 (1995) 245.
21. P.M. Kozlowski and L. Adamowicz, Int.J.Quantum Chem. 55 (1995) 367.
22. P.M. Kozlowski and L. Adamowicz, J.Phys.Chem. 100 (1996) 6266.
23. S. Gartenhaus and C. Schwartz, Phys.Rev.A 108 (1957) 482.
24. S.F. Boys, Proc.R.Soc. London Ser. A 258 (1960) 402.
25. K. Singer, Proc.R.Soc. London Ser. A 258 (1960) 412.
26. H.F. King, J.Chem.Phys. 46 (1967) 705.
27. W.A. Lester, Jr. and M. Krauss, J.Chem.Phys. 41 (1964) 1407; 42 (1965) 2990.
28. J.V.L. Longstaff and K. Singer, Proc.R.Soc. London Ser. A 258 (1960) 421.
29. J.V.L. Longstaff and K. Singer, Theoret.Chim.Acta 2 (1964) 265.
30. J.V.L. Longstaff and K. Singer, J.Chem.Phys. 42 (1965) 801.
31. N.C. Handy, Mol.Phys. 26 (1973) 169.
32. L. Salmon and R.D. Poshusta, J.Chem.Phys. 59 (1973) 3497.
33. K. Szalewicz and B. Jeziorski, Mol.Phys. 38 (1979) 191.
34. B. Jeziorski and K. Szalewicz, Phys.Rev. 19 (1979) 2360.
35. W. Kolos, H.J. Monkhorst and K. Szalewicz, J.Chem.Phys. 77 (1982) 1323; 77 (1982) 1335.
36. K.C. Pan and H.F. King, J.Chem.Phys. 56 (1972) 4667.
37. L. Adamowicz and A.J. Sadlej, J.Chem.Phys. 67 (1977) 4298.
38. L. Adamowicz and A.J. Sadlej, Chem.Phys.Lett. 48 (1977) 305.
39. L. Adamowicz, Int.J.Quantum Chem. 13 (1978) 265.
40. L. Adamowicz, Acta Phys.Pol. A 53 (1978) 471.
41. L. Adamowicz and A.J. Sadlej, J.Chem.Phys. 69 (1978) 3992.
42. L. Adamowicz and A.J. Sadlej, Acta Phys.Pol. A 54 (1978) 73.
43. L. Adamowicz and A.J. Sadlej, Chem.Phys.Lett. 53 (1978) 377.
44. K. Szalewicz, B. Jeziorski, H.J. Monkhorst and J.G. Zabolitzky, J.Chem.Phys. 78 (1983) 1420.
45. K. Szalewicz, B. Jeziorski, H.J. Monkhorst and J.G. Zabolitzky, J.Chem.Phys. 79 (1983) 5543.
46. B. Jeziorski, H.J. Monkhorst, K. Szalewicz and J.G. Zabolitzky, J.Chem.Phys. 81 (1984) 368.
47. K. Szalewicz, J.G. Zabolitzky, B. Jeziorski and H.J. Monkhorst, J.Chem.Phys. 81 (1984) 2723.
48. K.B. Wenzel, J.G. Zabolitzky, K. Szalewicz, B. Jeziorski and H.J. Monkhorst, J.Chem.Phys. 85 (1986) 3964.
49. D. Feldmann, Phys.Lett. 53 A (1975) 82.
50. K.E. McCulloh and J.A. Walker, Chem.Phys.Lett. 25 (1974) 439.
45 51. 52. 53. 54. 55. 56. 57. 58. 59. 60. 61. 62. 63. 64. 65.
66. 67. 68. 69.
70. 71. 72. 73. 74. 75.
W.A. Chupka, P.M. Dehmer, and W.T. Jivery, J.Chem.Phys. 63 (1975) 3929. L.R. Scherk, Can.J.Phys. 57 (1979) 558. K.R. Lykke, K.K. Murray, and W.C. Lineberger, Phys.Rev. A 43 (1991) 6104. C.L. Pekeris, Phys.Rev. 126 (1962) 1470. K. Frankowski and C.L. Pekeris, Phys.Rev. 146 (1966) 46. G.W.F. Drake, Nucl.Instr.Meth.Phys.Res. B 31 (1988) 7. J.D. Baker, D.E. Freund, R.N. Hill, and J.D.M. III, Phys.Rev. A 41 (1990) 1247. A.J. Thakkar and T. Koga, Phys.Rev. A 50 (1994) 854. J. Ackermann, Phys.Rev. A 52 (1995) 1968. B.W. Petley, Phys.Scr. T40 (1992) 5. G. Audi and A.H. Wapstra, Nucl.Phys. A 565 (1993) 1. H.A. Bethe and E.E. Salpeter, Quantum Mechanics of One- And Two-Electron Atoms (Plenum, New York, 1977). S.G. Nash, SIAM J. Numer. Anal. 21 (1984) 770. netlib can be accessed by ftp at [email protected] and [email protected] or by World Wide Web access at http://www.netlib.org. D.B. Kinghorn, Explicitly Correlated Gaussian Basis Functions: Derivation and lmplementation of Matrix Elements and Gradient Formulas Using Matrix Differential Calculus, Ph.D. dissertation, Washington State University, 1995. K.Varga and Y.Suzuki, Phys.Rev. A 52 (1995) 2885. K. Varga and Y. Suzuki, Phys.Rev. C (1996), in press. D.B. Kinghorn and L. Adamowicz, J.Chem.Phys. in press. L.C. Biedenharn and J.D. Louck, Angular Momentum in Quantum Physics. Theory and Application, Encyclopedia of Mathematics and Its Applications, Addison-Wesley, Reading, MA, 1981. W. Kolos and L. Wolniewicz, J.Chem.Phys. 43 (1965) 2429. E. Schwegler, P.M. Kozlowski and L. Adamowicz, J.Comp.Chem. 14 (1993) 566. Z. Zhang, P.M. Kozlowski and L. Adamowicz, J.Comp.Chem. 15 (1994) 54. Z. Zhang and L. Adamowicz, J.Comp.Chem. 15 (1994) 893. Z. Zhang and L. Adamowicz, Int.J.Quantum Chem. 54 (1995) 281. D.W. Gilmore, P.M. Kozlowski, D.B. Kinghorn and L. Adamowicz, Int.J.Quantum Chem., accepted for publication.
Z.B. Maksić and W.J. Orville-Thomas (Editors)
Pauling's Legacy: Modern Modelling of the Chemical Bond
Theoretical and Computational Chemistry, Vol. 6
© 1999 Elsevier Science B.V. All rights reserved.
The Mills-Nixon Effect: Fallacies, Facts and Chemical Relevance

Zvonimir B. Maksić (a), Mirjana Eckert-Maksić (b), Otilia Mó (c) and Manuel Yáñez (c)

(a) Quantum Organic Chemistry Group, Division of Organic Chemistry and Biochemistry, Rudjer Bošković Institute, P.O.B. 1016, 10001 Zagreb, Croatia and Faculty of Science and Mathematics, The University of Zagreb, Marulićev trg 19, 10000 Zagreb, Croatia

(b) Physical Organic Chemistry Laboratory, Division of Organic Chemistry and Biochemistry, Rudjer Bošković Institute, P.O.B. 1016, 10001 Zagreb, Croatia

(c) Departamento de Química C-9, Universidad Autónoma de Madrid, Cantoblanco, 28049-Madrid, Spain
1. Introduction

The Mills-Nixon (MN) effect has a long, respectable but controversial history: it has been praised, questioned and disputed over decades, at times culminating in heated arguments and debates. Although the MN effect is only a very small episode, it is an interesting case for the history of science and perhaps for the science of science. It possesses all the ingredients necessary for a thrilling saga about the winding path of the research process and its stepwise progress. The story begins with an interesting finding accompanied by a bold working hypothesis, followed by some evidence which supported the original experimental findings. These results were followed by attempts to invalidate or dismantle the pioneering discovery by focusing only on the wrong initial premise, by a "burial" of the original work and by a simultaneous "rediscovery" of the same effect, albeit in a somewhat disguised form. The entire process was blended with misconceptions, the charming naivety of some researchers and/or a disregard of other contributions to the subject bordering on fraud and swindle. Fortunately, there are also some unpretentious contributions representing perhaps small but significant improvements of the existing knowledge, which raise the hope that things will be settled as time elapses. It is the aim of this chapter to review the main achievements some seventy years after the original paper of Mills and Nixon in 1930 [1], concentrating on the structure, properties and reactivity of aromatic compounds annelated to strained rings. Last but not least, some common fallacies are discussed in order to solve this conundrum. It is beyond the size and scope of this chapter to provide comprehensive coverage of the field. We therefore apologise for the unintentional omission of some papers which would otherwise have been mentioned.
2. The Mills-Nixon Effect: The First Experimental Result and Theoretical Interpretation by Sutton and Pauling

Mills and Nixon found that an annelated small ring had a directing capability in electrophilic substitution reactions taking place at the benzene fragment [1]. More specifically, they established that the free β'-position in β-hydroxyindan is more susceptible to electrophilic substitution than the α-site (Fig. 1), as revealed by bromine or diazo group substitutions. The opposite should be the case in β-hydroxytetralin.
Figure 1. (a) Schematic representation of β-hydroxybenzocycloalkenes. (b) Predominant Kekulé electron pairing scheme in β-hydroxyindan according to the MN hypothesis. (c) Preconceived dominating Kekulé structure in β-hydroxytetralin.
This intriguing regioselectivity induced by the fused carbocycle triggered a cascade of papers addressing the electrophilic reactivity of annelated benzenes [2-7], which has continued to flow to the present day [8-14]. An interpretation of this effect was first put forward by Mills and Nixon themselves. It was based on the tetrahedral distribution of the carbon atom valencies. Consider bond angles around the carbon junction atom in indan as depicted in Fig. 2:
Figure 2. Hypothesis about the "Kekulé-resonance structure fixation" in indan by Mills and Nixon.
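The angle bookkeeping behind this hypothesis can be checked with a few lines of arithmetic. The sketch below is only an illustration of the tetrahedral picture, not a structure calculation: it assumes a planar junction carbon whose ring angle equals the ideal tetrahedral angle and whose two remaining in-plane angles are equal. The function names are hypothetical.

```python
import math

def tetrahedral_angle_deg():
    """Ideal tetrahedral angle, arccos(-1/3), in degrees."""
    return math.degrees(math.acos(-1.0 / 3.0))

def remaining_in_plane_angle_deg(ring_angle_deg):
    """Angle left for each of the two other valences at a planar carbon.

    Assumption: the three in-plane angles sum to 360 degrees and the two
    angles outside the small ring are equal.
    """
    return (360.0 - ring_angle_deg) / 2.0

ring = tetrahedral_angle_deg()
outer = remaining_in_plane_angle_deg(ring)
print(f"ring angle  = {ring:.1f} deg")
print(f"outer angle = {outer:.1f} deg")
```

Under these assumptions the two rounded values agree with the 109.5° and 125.3° quoted for the Mills-Nixon model of indan.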
The angle in the five-membered carbocycle is close to 109.5°, leaving 125.3° for each of the two remaining valence angles in the molecular plane. The double bond is then given in the bent bond representation by the overlap of two pairs of sp³ hybrid orbitals. This simple model implies that the joint CC bond is described by sp³-sp³ hybrids, a supposition which was proved to be essentially correct by later quantum mechanical calculations performed on many small ring fused aromatics. If this is the case, then the Kekulé structure of the benzene fragment in β-hydroxyindan shown in Fig. 2 should be preferred, leading to its freezing and to a concomitant dramatic alternation of CC bond distances. However, Mills and Nixon erroneously concluded, employing this simple pictorial concept, that β-hydroxyindan existed in the form of two isomers corresponding to the two possible Kekulé spin-coupling schemes of the benzene fragment, and that the one presented in Fig. 2 predominates in their mixture [1]. This interpretation is obviously wrong and outdated, but this kind of rationalization of the experimentally observed data should clearly be distinguished from the MN-effect itself. Generally speaking, a genuine phenomenon should not be completely identified with a model developed for its description, not to mention the possibility that the employed model could be oversimplified, inadequate or completely wrong. The first correct interpretation of the selectivity in the electrophilic reactions of fused aromatics was given by Sutton and Pauling, using the valence bond (VB) method in its most elementary form [15]. The total molecular wavefunction was described in terms of two VB components corresponding to the Kekulé structures of benzene. It appeared that the one related to the situation depicted in Fig. 2 possessed a larger coefficient and more weight, as was intuitively expected.
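The idea that a small strain-induced energy difference between two Kekulé structures shifts their weights only slightly can be illustrated with a generic two-state model. The sketch below is not the Sutton-Pauling calculation: it assumes two orthonormal model structures (real Kekulé structures overlap), and the energies and coupling are hypothetical numbers in arbitrary units chosen only to show the trend.

```python
import numpy as np

def two_structure_ground_state(e1, e2, coupling):
    """Ground-state energy and structure weights of a 2x2 model Hamiltonian.

    e1, e2   : energies of the two model structures (assumed orthonormal)
    coupling : resonance (off-diagonal) matrix element
    """
    h = np.array([[e1, coupling], [coupling, e2]], dtype=float)
    energies, vectors = np.linalg.eigh(h)  # eigenvalues in ascending order
    ground = vectors[:, 0]                 # eigenvector of the lowest eigenvalue
    return energies[0], ground**2          # weights are squared coefficients

# Free benzene: the two structures are degenerate (hypothetical units).
e_sym, w_sym = two_structure_ground_state(0.0, 0.0, -1.0)

# Annelated benzene: strain favours structure 1 by a small amount.
e_str, w_str = two_structure_ground_state(-0.1, 0.1, -1.0)

print("degenerate weights:", w_sym)
print("perturbed weights: ", w_str)
```

In this toy model the favoured structure gains weight, but the shift stays modest as long as the energy splitting is small compared with the resonance coupling, which is the qualitative point of the VB argument.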
In performing the actual calculations, some approximations about the distribution of the strain energy induced by the five-membered ring were invoked. It was concluded, owing to the substantial energy associated with resonance between the two VB structures, that the changes caused by strain should be relatively small. Nevertheless, they were large enough to account for all experimental data available at that time, thus providing the first quantum mechanical explanation of the MN-effect.

2.1. Definition of the MN-Effect and some Common Fallacies

It is useful to have at hand an operational definition of the MN-effect in order to avoid unnecessary semantic difficulties. We propose the following conceptually simple and intuitively appealing definition: The Mills-Nixon effect is a perturbation of the aromatic moiety exerted by fusion of one (or several) nonaromatic angularly strained molecule(s). This perturbation is reflected in a characteristic partial bond localization leading to modifications of a number of physical and chemical properties of the aromatic moiety.

Thus the notion of the Mills-Nixon effect is free of any predetermined underlying mechanism pertaining to the exerted perturbation. This is important because the mechanisms and manifestations of the effect may be different in various molecular systems. More precisely, the MN-effect is generally the result of an interplay of several types of intramolecular interactions, whose contributions as a rule vary from one family of compounds to another. It should also be emphasized that angularly strained fragments annelated to an aromatic nucleus are tacitly understood to be molecular systems strained by the ring closure requirements. They are not necessarily small and/or monocyclic. Finally, a useful rule of thumb can be applied in identifying Mills-Nixon systems. It is given by a diagnostic tool
provided by the length of the aromatic CC bonds placed ortho to the bond of coalescence of two rings. If they are shorter than in a free aromatic molecule, then the MN distortion takes place. It is important to realize that there are reversed MN-systems too, which require an extension of the proposed definition (vide infra). It follows that the MN-effect is reflected in changes of diverse properties of the free aromatic compound upon annelation. Consequently, various experimental techniques and appropriate theoretical models are required to provide a comprehensive description of the observed effect. Unfortunately, there are numerous misconceptions about this phenomenon in the literature, which introduce a lot of confusion. We give here a short catalogue of the main fallacies and provide brief comments in order to avoid unnecessary misunderstandings:

(1) Mills and Nixon based their interpretation on the Kekulé time-averaged oscillating model of benzene. This picture is obsolete, since it was conclusively shown subsequently that benzene has a single-well potential. The idea of rapidly equilibrating Kekulé forms requires a double-well potential, which has been abandoned in the meantime. Therefore, the MN-effect does not exist and this anachronistic structural proposal can be safely discarded [16].

There are two misconceptions in this line of thought. Firstly, the MN-effect is an experimental fact independent of its interpretation, which in turn could be right or wrong. The latter was indeed the case in the original Mills-Nixon milestone paper. Secondly, the MN-phenomenon cannot be reduced to a structural feature only, although fusion of small rings involves some significant geometric changes too. It is intuitively clear that fusion of (highly) strained ring(s) to an aromatic moiety inevitably leads to a change in a number of physical and chemical properties of the latter.
(2) A double-well shape of the potential energy surface of benzene and its annelated derivatives would imply that even small perturbations cause large changes in the populations of the valence tautomers, each confined in its own potential well. Therefore, small perturbations would have far-reaching consequences. In contrast, a single-well potential is robust and thus less sensitive to various perturbations. In line with the single-well picture, small ring annelated benzenes like tris-cyclobutabenzene do not exhibit any appreciable bond alternation: specifically, the bond lengths in the central ring of this compound are 1.40 and 1.38 Å. This small difference can be neglected [16]. In another study Boese et al. [17] discussed the matter, claiming that bond alternations induced in benzene by annelation with small rings, if present at all, are very minor (<0.025 Å) and thus chemically insignificant. Consequently, they should not be taken as support for Mills-Nixon arguments [17].

Differences in the CC bond lengths of the aromatic moiety which are less than 0.025 Å are indeed small if compared to the distances of 1.48 Å and 1.33 Å characterising C(sp²)-C(sp²) single and C(sp²)=C(sp²) double bonds, respectively, as usually pointed out by X-ray researchers. However, they are quite significant for many properties which depend critically on interatomic distances, e.g. molecular quadrupole moments, electric field gradients, spin-spin coupling constants of directly bonded atoms etc. The fact that the X-ray technique is not a very accurate method for estimating molecular spatial structure is not a good reason for putting a telescope to one's blind eye and claiming that differences do not exist or that they are unimportant. If errors in the CC bond distances of the order of 0.025 Å are acceptable in X-ray investigations, they are too large for many other experimental methods and observables which are sensitive to the molecular geometry. Such low precision is certainly not satisfactory for the modern quantum theory of molecular structure.
(3) The Mills-Nixon effect is a consequence of the angular strain and the accompanying σ-rehybridization of the junction carbon atoms [18]. In other words, σ- and π-contributions should be dissected and treated separately. Additionally, the σ-effect is perhaps dramatic in some nonexisting model systems, where severe bond bending takes place [19], but it is rather small in real fused molecular systems, where bent bonding occurs to a much lesser extent [20].

The fact of the matter is that MN-distortions of the aromatic nucleus are affected by rehybridization, π-electron conjugation and/or hyperconjugation. These components are intermingled and they are not always easily resolved and separated. We have been able to show that the appearance of bent bonding does not invalidate arguments in favor of the MN-alternation of CC bonds in fused benzenes and their fluoroderivatives [21].
(4) The MN-effect involves partial π-electron bond fixation. If so, it would be "visible" in some carefully designed NMR experiments. Mitchell et al. [22] focused their attention on dimethylhydropyrene 1 (Fig. 3):
Figure 3: Two equivalent resonance structures of dimethylhydropyrene 1
which is well represented by two equivalent resonance structures. The chemical shifts of the internal methyl groups are very sensitive to the degree of localization in the macrocyclic ring. Now, two cyclobutene rings can be annelated to 1 in meta fashion, yielding the predominant structure 2 in accordance with the MN-hypothesis (Fig. 4). The second possible resonance structure would operate against the supposed MN-effect and is neglected in further considerations. On the other hand, para fusion of cyclobutanes provides molecular system 3, which can be described with two equivalent resonance structures 3a and 3b. To make a long story short, 2 should exhibit some π-electron localization whereas 3 should not. Hence a difference in the chemical shifts of the internal methyl protons in systems 2 and 3 would provide conclusive evidence about the existence or nonexistence of the MN-effect. Analysis of the NMR data has led to the conclusion that there is no significant difference in the π-electron density distribution in 2 and 3. Consequently, the MN-effect does not exist.

Figure 4. Relevant resonance structures (2, 3a and 3b) of dimethylhydropyrene 1 which is either meta or para diannelated with cyclobutane rings.

This experiment was repeatedly used as a final resolution of the MN dilemma. It is a very elegant piece of work, but the problem is that the four-membered cyclobutene ring cannot induce a significant deformation of a molecular skeleton as sizeable as that involved in 1. In particular, the second resonance structure of the isomer 2, neglected by Mitchell et al. [22] in their deliberations, does in fact contribute considerably to the resonance effect and to the subsequent strong delocalization in the meta bisannelated compound 2. Consequently, the result of this experiment is not surprising. It is very important to avoid jumping to conclusions, and it is particularly dangerous to make sweeping generalizations on the basis of a single observation. We also note in passing that measurements of the ³J(H-C....C-H) proton-proton spin-spin coupling constants were not sensitive enough to detect the MN effect even in benzocyclopropene [23,24].

In conclusion, the moral of this story is that one should define very clearly the relevant questions related to the phenomenon under study and then select adequate techniques and/or methods able to provide reliable answers. It is also of crucial
importance to understand the underlying physical picture and to develop a good theoretical model capable of providing results in harmony with verified experimental data while offering their interpretation at the same time.

3. Structural Consequences of the MN-Effect

We shall commence our review of relevant experimental and theoretical results by drawing attention to the structure of the most famous molecule of this decade, C60 fullerene (Fig. 5):
Figure 5. Ball-and-stick model of buckminsterfullerene C60.
The salient structural pattern of this fascinating molecule is that it has five-membered rings, each surrounded by a corona of five benzene fragments. One can distinguish two types of CC bonds: the ipso (endo) bond, where two different rings coalesce, and the ortho (exo) bond, which radiates away from the five-membered ring. The difference in their lengths is highly pronounced: d(ipso) = 1.458 [1.455] Å and d(ortho) = 1.410 [1.391] Å, where the first figure refers to the electron diffraction result [25], whilst numbers given within square brackets are X-ray data [26]. NMR experiments also discriminate between two widely different ipso and ortho bond distances of 1.450 and 1.400 Å, respectively [27]. These bond distances are apparently in harmony with the MN-type of distortion of the benzene ring (vide infra).
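To put these numbers side by side, the short sketch below computes the ipso-ortho difference for each of the three experiments and compares it with the 0.025 Å figure quoted in fallacy (2) as the upper limit of "chemically insignificant" alternation. The dictionary layout and variable names are illustrative choices; only the bond lengths come from the text.

```python
# Ipso and ortho CC distances in C60 (angstrom), as quoted in the text.
c60_bonds = {
    "electron diffraction [25]": (1.458, 1.410),
    "X-ray [26]": (1.455, 1.391),
    "NMR [27]": (1.450, 1.400),
}

# Threshold quoted in fallacy (2) for "very minor" bond alternation.
MINOR_ALTERNATION = 0.025

deltas = {}
for method, (d_ipso, d_ortho) in c60_bonds.items():
    deltas[method] = d_ipso - d_ortho
    print(f"{method:28s} d(ipso) - d(ortho) = {deltas[method]:.3f} A")

# True if every experiment shows alternation above the quoted threshold.
all_above = all(delta > MINOR_ALTERNATION for delta in deltas.values())
print("all above 0.025 A:", all_above)
```

All three differences are roughly twice the quoted threshold or more, with the ipso bond the longer one in every case.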
3.1. The Role of Rehybridization

It is common wisdom that small ring compounds possess bent bonds, implying that the maxima of their electron density lie outside the ring. Concomitantly, local hybrid orbitals deviate from the straight lines passing through the neighbouring nuclei, which has some remarkable chemical consequences [28-31]. For example, small rings exhibit high angular strain energies which increase their reactivity. It is reasonable to assume that a fraction of the strain energy spills over from the small ring to the aromatic fragment in fused systems. It is gratifying that one can design a strained model compound which mimics rather nicely the fusion of three cyclobutane rings to the benzene moiety. This is achieved by simultaneous bending of vicinal C-H bonds towards each other so that they assume the angle of 94° found in cyclobutabenzenes (Fig. 6). It appears that this deformed benzene reproduces the salient features of the real molecular system in a transparent and satisfactory manner [11,14,19,20,32], which can be easily understood in terms of the rehybridization [31,33,34] and the concomitant π-electron density shifts within this molecular model system [35].

Figure 6. D3h deformed benzene 4a and its perfluoro-derivative 4b.

Results presented in Table 1 reveal that forced distortions cause a shift of the s-character from the ipso (i) C(1)-C(2) bond to the ortho (o) C(2)-C(3) bond. Interestingly, changes in the σ-plane induce a redistribution of the π-bond density, which is increased in the (o)-bonds and depleted in the (i)-bond in harmony with rehybridization. The synergistic action of σ- and π-electrons produces alternation of the CC bond lengths exactly in the sense of the MN-effect. Furthermore, the (i)-bond in deformed benzene 4a (α = 90°) is described practically by sp³-sp³ hybridization, in accordance with the tetrahedral carbon atom model used by Mills and Nixon. The induced variation in bond distances is nicely reproduced by LSF (least-squares fit) correlations against the π(CC) bond orders and the average s-characters (Fig. 7). In both cases the functional dependence is quadratic. They read:
$$d(\mathrm{CC}) = 0.007\,(s\%)^2 - 0.056\,(s\%) + 2.6 \ \text{Å} \qquad (1)$$

$$d(\mathrm{CC}) = 1.253\,(\pi\mathrm{bo})^2 - 2.138\,(\pi\mathrm{bo}) + 2.3 \ \text{Å} \qquad (2)$$
The quality of these two correlations is very good, as evidenced by the average absolute deviations from the HF/6-31G* CC bond lengths being only 0.003 Å. The shape of curves
Table 1
Bond distances (in Å), NBO s-characters (in %), Löwdin π-bond orders and Löwdin atomic charges of deliberately distorted benzene and perfluorobenzene as obtained by the MP2(fc)/6-31G* model.

                                                                Löwdin population*
Molecule         Bond     Distance   s-character   π-bond order   A    Q_A
4a (α = 120°)    CC       1.397      35.1-35.1     0.66           C    6.16
                 CH       1.087      29.6-100.0                   H    0.84
4a (α = 110°)    CC(i)    1.414      33.3-33.3     0.63           C    6.17
                 CC(o)    1.385      36.9-36.9     0.68           H    0.84
                 CH       1.087      29.7-100.0
4a (α = 100°)    CC(i)    1.446      31.0-31.0     0.59           C    6.17
                 CC(o)    1.374      38.7-38.7     0.72           H    0.83
                 CH       1.087      30.1-100.0
4a (α = 90°)     CC(i)    1.515      27.4-27.4     0.49           C    6.18
                 CC(o)    1.357      41.1-41.1     0.80           H    0.82
                 CH       1.083      31.4-100.0
4b (α = 120°)    CC       1.393      37.8-37.8     0.61           C    5.84
                 CF       1.341      24.1-30.5     0.23           F    9.16
4b (α = 110°)    CC(i)    1.431      34.9-34.9     0.51           C    5.84
                 CC(o)    1.370      40.5-40.5     0.71           F    9.16
                 CF       1.343      24.3-30.3     0.23
4b (α = 100°)    CC(i)    1.557      30.0-30.0     0.33           C    5.84
                 CC(o)    1.340      44.4-44.4     0.82           F    9.15
                 CF       1.343      25.4-29.8     0.24

* A and Q_A denote the atom in question and its total electron density, respectively.
presented in Fig. 7 suggests a linear dependence of the π-bond orders on the average s-character. This turns out to be true, as revealed by the LSF procedure:
$$(\pi\mathrm{bo}) = 0.021\,(s\%) - 0.1 \qquad (3)$$
where the average absolute error of (πbo) estimated by relation (3) is only 0.01. It goes without saying that the correlation between the π-bond orders and the average s-characters of the CC bonds is expected in view of the rehybridization mechanism induced by benzene deformation. The asymmetry in the CC bonds can be further enhanced by the fluorination effect. According to the Walsh-Bent rule [36,37], an electronegative atom prefers a carbon hybrid AO possessing increased p-character. Hence, the remaining s-content is distributed over the (o)- and (i)-CC bonds in such a way that their difference is amplified (Table 1). It should be pointed out that there are relations in perfluorobenzene which are analogous to relations (1)-(3) in distorted benzenes, but which possess different fitting parameters. This is not surprising, because even in undeformed perfluorobenzene the πbo value is 0.61 as compared to 0.66 in the parent benzene. Apparently, correlations (1)-(3) are useful in revealing some general basic interrelations, but they differ in their explicit expressions in
Figure 7. The least-squares-fit quadratic curves relating the CC bond distances (in Å) in the D3h distorted benzene to the average s-character in percent (a) and the π-bond order (b).
different families of compounds. To summarize: deformed benzene and perfluorobenzene represent two clear-cut cases which underline the importance of rehybridization in annelated systems. They also convincingly show that reorganization of the π-electron density can be triggered by the redistribution of the CC s-characters which takes place in the σ-plane, although it is admittedly true that a part of the lengthening of the ipso bonds is due to repulsion between pairs of close vicinal H or F atoms.
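The least-squares correlations discussed in this section can be sketched directly from the CC entries for distorted benzene 4a in Table 1. The code below is a minimal illustration using NumPy's polynomial fitting, not the authors' procedure; because it refits the tabulated values, the resulting coefficients need not coincide with those printed in relations (1)-(3). The helper name `fit_with_mad` is hypothetical.

```python
import numpy as np

# CC bonds of distorted benzene 4a from Table 1 (alpha = 120, 110, 100, 90 deg):
# average s-character (%), pi-bond order and bond length (angstrom).
s_char = np.array([35.1, 33.3, 36.9, 31.0, 38.7, 27.4, 41.1])
pi_bo = np.array([0.66, 0.63, 0.68, 0.59, 0.72, 0.49, 0.80])
d_cc = np.array([1.397, 1.414, 1.385, 1.446, 1.374, 1.515, 1.357])

def fit_with_mad(x, y, degree):
    """Least-squares polynomial fit plus its mean absolute deviation."""
    coeffs = np.polyfit(x, y, degree)          # highest power first
    mad = float(np.mean(np.abs(np.polyval(coeffs, x) - y)))
    return coeffs, mad

quad_s, mad_s = fit_with_mad(s_char, d_cc, 2)    # d(CC) vs s-character
quad_pi, mad_pi = fit_with_mad(pi_bo, d_cc, 2)   # d(CC) vs pi-bond order
lin, mad_lin = fit_with_mad(s_char, pi_bo, 1)    # pi-bond order vs s-character

print("d(CC) vs s%  :", quad_s, "MAD =", round(mad_s, 4))
print("d(CC) vs pibo:", quad_pi, "MAD =", round(mad_pi, 4))
print("pibo vs s%   :", lin, "MAD =", round(mad_lin, 4))
```

Both quadratic fits reproduce the tabulated distances to within a few thousandths of an ångström on average, and the slope of the linear fit stays close to the 0.021 per percent of s-character quoted in relation (3).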
Another very illuminating example which illustrates the crucial role of rehybridization is provided by all-cis-tris(benzocyclobuta)-cyclohexane [38], depicted in Fig. 8. The molecule resembles a rose with petals given by three benzocyclobutenes, all placed up in the molecular crystal relative to the plane of the central cyclohexane ring. A striking feature of this interesting compound is a completely flat central cyclohexane moiety possessing remarkably strong alternation of the CC bond lengths. The ipso C(1)-C(2) and ortho C(1)-C(1') bond lengths are 1.599 (1.595) and 1.511 (1.491) Å, respectively, where numbers within parentheses refer to AM1 calculations [39], whilst the preceding figures are X-ray values [38]. It is noteworthy that the difference d(CC)i - d(CC)o = 0.09 (0.10) Å is highly significant and that theory and experiment are in good agreement. The very long (i)-bond has low s-character (21.4%-21.4%) as compared to the 27.1%-27.1% hybridization in the (o)-bonds. It is also important to mention that the energy partitioning technique indicates substantial differences in bond strengths, the ortho-bond being much stronger [39]. Consequently, bond alternation is induced exclusively by the σ-framework, through the rehybridization caused by fusion of the cyclobutane rings. Since π-electrons are missing in the central ring, this molecule offers convincing evidence that the rehybridization effect is of paramount importance.

Figure 8. Schematic representation of all-cis-tris(benzocyclobuta)-cyclohexane and numbering of relevant carbon atoms.

3.2. The Role of π-Delocalization

It would be of interest to find a fused system where the CC bonds have the same hybridization and the partial bond fixation is produced by π-localization only. There is, fortunately, such a molecule, which is shown in Fig. 9. Although triphenylene itself does not represent a MN-system in the sense of the definition given in section 2.1, belonging actually to
Figure 9. Relevant CC distances in triphenylene obtained by the X-ray analysis corrected for thermal motion [40]. The MP2(fc)/6-31G* results are given within parentheses [43]. Distances labelled in the figure: 1.402 (1.400), 1.374 (1.385), 1.406 (1.413) and 1.459 (1.463) Å.
compounds exhibiting reversed MN π-electron bond fixation, it provides a good illustrative example of bond localization in polycyclic planar systems. The X-ray and neutron diffraction analyses of triphenylene [40] reveal two kinds of CC bonds in the central benzene ring. The annelated (i)-bonds are considerably shorter (1.411 Å) than the (o)-bonds (1.459 Å), a feature which Baldridge and Siegel found puzzling [18]. They argued that the fused bonds were shorter because the annelated peripheral benzenes were aromatic (4n+2)π systems. This explanation is vague, because the distal benzene fragments exhibit some localization and bond alternation too. The fact of the matter is that triphenylene is better described as if it were composed of three naphthalene moieties coalesced in the central ring. Naphthalene itself exhibits π-electron localization by forming distal partial cis-1,3-butadiene patterns, as revealed both by X-ray measurements [41] and theoretical calculations [42] (Fig. 10). It appears that each of the twin benzene fragments tends to preserve its aromaticity by localizing the other ring in the cis-1,3-butadiene fashion, thus producing the characteristic naphthalene bond fixation pattern (Fig. 10). This picture is supported by the bond distances obtained by the scaled (HF/3-21G)sc and (HF/6-31G*) models and by the corresponding π-bond orders [42]. We note in passing that MP2(fc)/6-31G* and MP3(fc)/6-31G* calculations [43] give bond distances similar to those obtained by the HF/6-31G* model. This is evidenced by the following data: d(C(1)-C(2)) = [1.380, 1.373, 1.371], d(C(2)-C(9)) = [1.419, 1.424, 1.422], d(C(9)-C(10)) = [1.432, 1.422, 1.420] and d(C(4)-C(5)) = [1.415, 1.420, 1.412] in Å, where the triads of numbers correspond to MP2, MP3 and X-ray [41] structural parameters, respectively. It is also interesting to mention that the variation in the π-electron density induces here some, albeit very small, rehybridization changes. For instance, the C(1)-C(2) and C(9)-C(10) bonds are described by s-characters of 33.4%-33.8% and 32.7%-32.7%, respectively, whereas the average s-content of the most peripheral C(1)-C(8) bond is 31.6%. Since the central benzene in triphenylene is a part of three naphthalenes, it is not surprising that its ortho-bonds are roughly three times more stretched than the C(2)-C(9) bond in the parent
Figure 10. Schematic illustration of the tendency of each benzene fragment in naphthalene to retain its aromaticity by producing cis-1,3-butadiene partial localization in its twin ring, as described by the resonance structures (7a) and (7b), yielding the resulting predominant canonical structure (7c). This intuitive argument is supported by the (HF/6-31G*) bond distances and the corresponding π-bond orders given within parentheses.
naphthalene (relative to benzene itself). Additionally, the (i)-bonds of the central ring in triphenylene are shorter than in free naphthalene, reflecting a collective effect of the three peripheral rings. We would like to reiterate that the mode of localization in this molecule is characteristic of the so-called reversed MN effect (see later). It is noteworthy that the peripheral rings exhibit significant alternation of CC bond lengths in spite of their alleged aromaticity. It follows that π-electrons can themselves produce significant bond fixation even if angular distortions and the accompanying Baeyer strain are absent [44]. Apparently, fused systems exhibit a wide spectrum of electronic features which should be very carefully examined.
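The "roughly three times" statement can be checked with the MP2(fc)/6-31G* distances quoted in this chapter. The sketch below assumes that the value in parentheses next to the 1.459 Å X-ray distance in Fig. 9 is the MP2 length of the central-ring (o)-bond of triphenylene, and it uses the MP2 benzene bond length of 1.395 Å quoted in section 3.3 as the reference; the variable and function names are illustrative.

```python
# MP2(fc)/6-31G* CC distances in angstrom, as quoted in the chapter.
BENZENE_MP2 = 1.395          # free benzene (section 3.3)
naphthalene_c2_c9 = 1.419    # C(2)-C(9) bond in naphthalene
triphenylene_ortho = 1.463   # (o)-bond of the central ring (assumed from Fig. 9)

def stretch(distance, reference=BENZENE_MP2):
    """Elongation of a CC bond relative to free benzene."""
    return distance - reference

ratio = stretch(triphenylene_ortho) / stretch(naphthalene_c2_c9)
print(f"naphthalene stretch : {stretch(naphthalene_c2_c9):.3f} A")
print(f"triphenylene stretch: {stretch(triphenylene_ortho):.3f} A")
print(f"ratio               : {ratio:.1f}")
```

With these inputs the ratio comes out a little below three, consistent with the qualitative picture of three naphthalene units sharing the central ring.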
3.3. Paradigmatic Indan and Tetralin Cases

We shall briefly discuss structural features of the paradigmatic molecules indan (8) and tetralin (9), which are compared with the angular strain-free compound o-dimethylbenzene (10). They are schematically depicted in Fig. 11. The MP2(fc)/6-31G* and B3LYP/6-31G* results [43] are presented in Table 2. It appears that annelation of the five-membered ring induces tiny changes within the benzene fragment. In this connection it is useful to mention that the CC bond length in free benzene is 1.395 Å and 1.397 Å by the MP2(fc)/6-31G* and B3LYP/6-31G* models, respectively. It appears that the (i)-bond in indan is longer than the (o)-bonds by 0.002-0.003 Å, which is a very tiny difference indeed. It is noteworthy, however, that the C(1)-C(2)-C(9) angle of the five-membered ring is 111.5°, being very close to the tetrahedral value as assumed by Mills and Nixon. One should observe that the variation of the bond angles in the benzene ring lies within the limits 120° ± 1°. Interestingly, the X-ray data [45] show a more pronounced difference in bond distances, d(CC)i - d(CC)o = 1.393 Å - 1.382 Å = 0.01 Å, which is very well reproduced by the simple HF/3-21G model (≈ 0.01 Å) [46]. Structural features of trisannelated benzenes will be discussed in more detail in section 3.6. We just note here in passing that the HF/3-21G* model predicts slightly stronger bond alternation for triscyclopentabenzene, as intuitively expected. The (i)- and (o)-bond distances are 1.393 Å and 1.375 Å [46], respectively,
Figure 11. Charts of indan (8), tetralin (9) and o-dimethylbenzene (10) and numbering of heavy atoms (not using the IUPAC convention).
Table 2. Selected structural parameters of indan 8 and tetralin 9 as obtained by the MP2(fc)/6-31G* and B3LYP/6-31G* procedures (distances in Å, angles in degrees).

Mol.  Bond/Angle        MP2     B3LYP   | Mol.  Bond/Angle         MP2     B3LYP
8     C(1)-C(2)         1.398   1.399   | 9     C(1)-C(2)          1.407   1.406
      C(2)-C(3)         1.396   1.396   |       C(2)-C(3)          1.403   1.403
      C(3)-C(4)         1.397   1.397   |       C(3)-C(4)          1.392   1.393
      C(4)-C(5)         1.400   1.399   |       C(4)-C(5)          1.397   1.397
      C(1)-C(7)         1.508   1.513   |       C(1)-C(7)          1.520   1.514
      C(7)-C(8)         1.553   1.559   |       C(7)-C(8)          1.534   1.526
      C(1)-C(2)-C(3)    120.4   120.3   |       C(8)-C(9)          1.533   1.526
      C(2)-C(3)-C(4)    119.2   119.4   |       C(1)-C(2)-C(3)     119.0   119.0
      C(1)-C(2)-C(9)    111.5   111.5   |       C(2)-C(3)-C(4)     121.5   121.4
      C(1)-C(7)-C(8)    104.8   104.9   |       C(1)-C(2)-C(10)    121.5   121.5
implying a difference of 0.02 Å. Unfortunately, the corresponding X-ray results are not very precise, being 1.395 ± 0.01 Å and 1.387 ± 0.01 Å, so that no definite conclusion can be drawn on the basis of the experimental data [47]. Tetralin exhibits tiny geometric changes in the completely opposite direction, leading to the reversed or anti-MN pattern. For instance, the annelated CC bond is longer than the ortho bonds by 0.004 Å, whilst the central bond angle in the benzene moiety is larger than 120° (121.5°, Table 2), in contrast to its counterpart in 8 (119.2°). Although all these structural variations in 8 and 9 are probably too small to be of relevance for X-ray crystallographers, they are significant enough to cause quite different behavior of these two antipodic compounds, to mention only their regioselectivity in electrophilic substitution reactions [10-14]. Finally, in ortho-xylene 10 the MP2(fc)/6-31G* model gives CC bond distances of 1.410, 1.399, 1.396 and 1.394 Å for the ipso (i), ortho (o), meta (m) and para (p) bonds, respectively. It follows that hyperconjugation of the two CH3 groups does not significantly influence the CC bonds of the aromatic ring. The small lengthening of the (i)-bond by 0.01 Å is best understood in terms of steric crowding.
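The ipso-ortho differences and the central bond angles discussed in this section follow directly from Table 2. The short Python sketch below recomputes them for indan 8 and tetralin 9 at both levels of theory. The dictionary layout and the helper names (`TABLE2`, `ipso_minus_ortho`, `angle_deviation`) are illustrative assumptions, not part of the original work.

```python
# Values transcribed from Table 2: annelated (ipso) C(1)-C(2) and ortho
# C(2)-C(3) distances in angstrom, and the C(2)-C(3)-C(4) angle in degrees.
# The data structure and function names are hypothetical, chosen for clarity.
TABLE2 = {
    ("indan 8", "MP2"): {"ipso": 1.398, "ortho": 1.396, "angle_234": 119.2},
    ("indan 8", "B3LYP"): {"ipso": 1.399, "ortho": 1.396, "angle_234": 119.4},
    ("tetralin 9", "MP2"): {"ipso": 1.407, "ortho": 1.403, "angle_234": 121.5},
    ("tetralin 9", "B3LYP"): {"ipso": 1.406, "ortho": 1.403, "angle_234": 121.4},
}


def ipso_minus_ortho(entry):
    """Difference d(ipso) - d(ortho) in angstrom, rounded to table precision."""
    return round(entry["ipso"] - entry["ortho"], 3)


def angle_deviation(entry):
    """Deviation of the C(2)-C(3)-C(4) angle from the ideal 120 degrees."""
    return round(entry["angle_234"] - 120.0, 1)


for (molecule, model), entry in TABLE2.items():
    print(
        f"{molecule:10s} {model:6s} "
        f"d(i)-d(o) = {ipso_minus_ortho(entry):+.3f} A, "
        f"angle - 120 = {angle_deviation(entry):+.1f} deg"
    )
```

The bond length differences are of similar size in both molecules, whereas the sign of the angle deviation is opposite in 8 and 9, which is the feature the text associates with the anti-MN pattern of tetralin.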
3.4. The Ring Size Effect
If the size of an annelated ring decreases, its angular strain increases due to a more pronounced bending of the local hybrid orbitals. It is therefore reasonable to expect larger distortions of the benzene fragment. This turns out to be true, as evidenced by benzocyclobutene 11 and benzocyclopropene 12 (Fig. 12). Their theoretical structural parameters are compared with experimental data in Table 3.

Figure 12. Schematic representation of benzocyclobutene 11 and benzocyclopropene 12 and numbering of atoms.

Some preliminaries are necessary, however, since it is well established by now that the constraints of the small ring induce bond angle strain in the benzene fragment [31,48-51]. Bending of hybrid orbitals gives rise to two distinct definitions of the bond distance between directly bonded atoms in molecules: (a) the interatomic bond distance (IBD), corresponding to the straight line passing through the directly linked nuclei, and (b) the bond path length (BPL), related to a ridge of maximum electron density (MED) between a pair of bonded atoms [52-54]. For strained rings the BPL values are clearly larger than their IBD counterparts. Analogously, one can distinguish geometrical bond angles and those defined by the MED lines emanating from a common nucleus. The latter are of course closer to the intuitive concept of the valence bond angles used in descriptive chemistry. The difference between BPL and IBD has led Stanger [20] and Boese et al. [17] to argue that the ring strain effect is compensated in the aromatic fragment by the bent bonds and that the use of IBD values in discussing the MN effect might be misleading. We have shown, however, that IBDs can be safely used instead of BPLs for all CC bonds of the aromatic perimeter with one notable exception: the fused ipso bond of the annelated three-membered ring [21]. In the latter case the annelated bond is strongly bent inside the benzene ring and bears little resemblance to the CC bond of free benzene, thus requiring special scrutiny (vide infra). Consequently, we shall use IBDs in our discussion from now on, unless stated otherwise.

Table 3. Selected structural parameters of benzocyclobutene 11 and benzocyclopropene 12 as obtained by several theoretical models and by the X-ray technique (IBDs in Å and angles in degrees). Theoretical values without brackets are HF/6-31G results; values in square brackets are MP2(fc)/6-31G* results.

Mol.  Bond/Angle        Theory                Exp.
11    C(1)-C(2)         1.387^a [1.396]^b     1.391^c
      C(2)-C(3)         1.376   [1.389]       1.385
      C(3)-C(4)         1.399   [1.403]       1.400
      C(4)-C(5)         1.392   [1.405]       1.399
      C(1)-C(7)         1.525   [1.518]       1.518
      C(7)-C(8)         1.582   [1.574]       1.576
      C(2)-C(1)-C(6)    122.3   [122.5]       122.3
      C(2)-C(3)-C(4)    116.1   [115.7]       116.0
      C(3)-C(4)-C(5)    121.4   [121.9]       121.7
      C(1)-C(2)-C(8)    93.7    [93.4]        93.5
12    C(1)-C(2)         1.340^d [1.352]^e     1.334^f
      C(2)-C(3)         1.368   [1.382]       1.363
      C(3)-C(4)         1.408   [1.409]       1.387
      C(4)-C(5)         1.395   [1.408]       1.390
      C(1)-C(7)         1.512   [1.503]       1.498
      C(2)-C(1)-C(6)    124.7   [124.6]       124.5
      C(2)-C(3)-C(4)    113.0   [113.0]       113.2
      C(3)-C(4)-C(5)    122.3   [122.5]       122.4
      C(1)-C(2)-C(7)    63.5    [63.3]        63.6

a HF/6-31G results of ref. [48]. b MP2(fc)/6-31G* structural parameters of ref. [43]. c X-ray data of ref. [49]. d HF/6-31G data of ref. [50]. e MP2(fc)/6-31G* geometry of ref. [21]. f X-ray estimates of ref. [51].

A survey of the data presented in Table 3 reveals that the simple HF/6-31G model gives structural parameters [48,50] in good accordance with the X-ray measurements [49,51]. There is an alternation of bond distances in 11 compatible with the MN effect. More specifically, the difference between the IBDs of the ortho and ipso bonds is δoi = -0.011 Å (-0.006 Å), where the experimental estimate is given within parentheses. Analogously, δmo = 0.023 Å (0.015 Å), implying that the meta bonds are longer than the ortho ones, as expected from the MN postulate. It is interesting to mention that Boese et al. [49] gave BPL values estimated from X-ray measurements, which differ very little from the interatomic distances (IBDs). They are as follows: l(C(1)-C(2)) = 1.41 Å, l(C(2)-C(3)) = 1.40 Å, l(C(3)-C(4)) = 1.41 Å and l(C(4)-C(5)) = 1.40 Å, implying that the ortho and para bonds are shorter by 0.01 Å, as required by the MN localization pattern. Hence both IBDs and BPLs offer the same picture. Perusal of the structural data for benzocyclopropene 12 presented in Table 3 shows that the shortest CC bond is C(1)-C(2). It would be erroneous, however, to conclude that it has
the highest π-bond order. On the contrary, the largest concentration of the π-electron density is found in the ortho and para bonds, which is consistent with the Mills-Nixon partial π-bond fixation [50]. In spite of that, Apeloig and Arad pointed out that "the concept of bond fixation provides no help in understanding the geometry or the chemistry of cyclopropabenzene" [60]. The stumbling block was apparently the fused CC bond, which should not be compared to the CC bond of free benzene. Instead, this part of the molecule, which is of crucial importance in the annelation process, can be simulated by a deformed cyclopropene in which the H-C(1)=C(2) angle is forced to be 124.6° (Table 4) in order to match the C(2)-C(1)-C(6) angle in 12 [61].

Table 4. Geometry of cyclopropene and of its deliberately deformed structure (distances in Å), s-characters (in %) of hybrid orbitals and Löwdin π-bond orders (πbo).

H-C(1)=C(2) angle   Bond        MP2(fc)/6-31G*     s-character   πbo
150° (equil.)       C(1)-C(2)   1.303 (1.296)^a    35.8-35.8     0.95
                    C(1)-C(7)   1.507 (1.509)^a    24.6-20.3     0.16
124.6° (model)      C(1)-C(2)   1.316              28.2-28.2     0.94
                    C(1)-C(7)   1.506              31.6-20.3     0.16

a MW data of Laurie et al. [62], given within parentheses.

By squeezing the H-C(1)=C(2) angle to 124.6° one induces rehybridization at the double bond between the two sp² C atoms, which increases its p-character. Concomitantly, the C=C double bond distance is increased by 0.013 Å. This estimate is realistic, since the structural parameters of cyclopropene itself calculated by the MP2(fc)/6-31G* model are in fine agreement with the MW data of Laurie et al. [62]. Hence, the additional increase of 0.036 Å observed in 12 (Table 3) is due to a drift of the π-electron density from the ipso to the two ortho bonds, which follows the shift in s-character taking place at the carbon junction atoms. This is evidenced by a decrease in the π-bond order from 0.94 in the deformed cyclopropene (Table 4) to 0.62 in 12 [61]. Concomitantly, the ortho bonds are significantly shorter than the meta bonds in 12, which is borne out by the HF/6-31G, MP2(fc)/6-31G* and X-ray results (Table 3). The appearance of bent bonds does not affect the conclusion that the MN effect is operative in benzocyclopropene 12 [21]. It would be useful to have at hand a quantitative estimate of the induced bond length alternation. We have introduced two very simple indices of bond localization related to the accompanying aromaticity defect [63]:
$$L_m(d) = \sum_n \left| d_n^{(m)} - \bar{d}_{cc} \right| / \text{Å} \qquad (4)$$

and

$$L_m(\pi) = \sum_n \left| \pi_n^{(m)} - \bar{\pi}_{cc} \right| \qquad (5)$$
where $\bar{d}_{cc}$ and $\bar{\pi}_{cc}$ denote the average CC bond distance and the average π-bond order, respectively. Further, m stands for the molecule in question, whereas n signifies a particular CC bond. The summation is extended over all CC bonds around the aromatic perimeter. In the case of benzocyclopropene the annelated bond should be excluded in view of its unusual bond distance and highly pronounced bent-bond character. Clearly, Lm(d) and Lm(π) are both zero for perfectly delocalized benzene. Their increase, on the other hand, reflects the presence of partial π-electron localization and the concomitant bond distance alternation. It is of some interest to mention the value of Lb(d) in a planar six-membered ring corresponding to the perfectly localized cyclohexatriene, which is 0.36 [63]. Utilizing the MP2(fc)/6-31G* bond distances of benzocyclopentene, benzocyclobutene and benzocyclopropene one obtains Lb(d) values of 0.008, 0.037 and 0.061, respectively. These values could be normalized per CC bond, but this is not necessary. We shall use the bond localization index Lb(d) as a measure of the geometric alternation (i.e. anisotropy) of the CC bonds encompassing the benzene ring. It is clear that the bond fixation is largest in 12 and smallest in benzocyclopentene, as evidenced by the Lb(d) values. This is in accordance with intuition and the angular strain argument. Variations in the bond lengths found in 11 and 12 are sometimes considered insignificant. For example, Siegel et al. [55] assessed the crystallographic significance of bond length variations in substituted benzenes in the following way: the length of the CC bonds in hexasubstituted benzenes was 1.397 Å with a standard deviation σ = 0.009 Å, whereas σ for all benzene derivatives is 0.013 Å. Accordingly, a difference between two CC bond distances (IBDs) is adopted as significant only if it exceeds ±3σ, i.e. ±(0.027-0.039 Å).
This may well be so in crystallography, but these error limits are too large for other branches of chemistry, not to mention modern quantum chemical models. We shall illustrate this point with just one example, given by the early analyses of Dewar [56] and Stoicheff [57] of the dependence of CC bond distances on hybridization. A rule was found stating that CC single bonds can be classified into spⁿ-spᵐ types (n and m being the integers 1, 2 and 3) and that rehybridization leads to a decrease of the CC distance by 0.04 Å when either n or m decreases by one. More subtle relations between CC single bonds and the local hybridization indices can be deduced from maximum overlap calculations [58]. A similar dependence on hybridization holds for C=C double bonds [59]. It follows as a corollary that allegedly negligible errors as large as 0.04 Å are equivalent to a change in hybridization of one C atom from sp³ to sp² or from sp² to sp¹, which can hardly be characterized as insignificant. In fact, this kind of rehybridization usually has remarkable repercussions on the physical and chemical properties of organic compounds [31]. A word on angular deformations is in place here. It is obvious that hybrid orbitals cannot accommodate the unusual bond angles around the carbon junction atoms. The most
dramatic angular distortions are found in benzocyclopropene 12, where the C(6)-C(1)-C(7) and C(1)-C(2)-C(7) angles assume values of 172° and 63.3°, respectively. Concomitantly, the C(2)-C(1)-C(6) angle is increased to 124.6° in order to decrease bond bending. This effect causes alternation of the bond angles in the benzene moiety in a typical domino fashion. Thus the apical C(2)-C(3)-C(4) angle is squeezed to a low value of 113.0°, whereas the peripheral C(3)-C(4)-C(5) angles are increased to 122.5°. Hence the angular strain is, strictly speaking, distributed over the whole system. Angular distortions in 8 and 11 are also present, but they are less dramatic (Tables 2 and 3). It follows that the Mills-Nixon effect influences both molecular size and shape! An interesting family of compounds is provided by the tensile-spring strained benzenes, where appreciable bond alternation takes place (Fig. 13).

Figure 13. Schematic representation of some tensile-spring strained benzene compounds (13-16) and numbering of relevant atoms.

The X-ray data are available only for 13 [64], where the benzene C(1)-C(2) bond is bridged by the spiropentane fragment. It is gratifying that the HF/6-31G* structural parameters [65] are in good agreement with the experimental estimates (Table 5). The variation of the IBDs around the benzene perimeter follows the MN pattern. Both theory and experiment predict increases in the ipso and meta bond lengths relative to the CC bond of free benzene taken as a reference. In contrast, the ortho and para bonds are slightly shortened. Anisotropy of bonds in several tensile spring
strained benzenes 13-16 is reflected in the difference in bond lengths δio between the ipso and ortho bonds. It increases as the angle of annelation α decreases, compound 15 being an exception (Table 6). This increase in bond fixation is reflected in the bond localization
Table 5. Selected bond distances of the benzene fragment strained by spiropentane 13 (in Å).

             Distances                 Differences δ^b
Bond         HF/6-31G*   X-ray^a      HF/6-31G*   X-ray^a
C(1)-C(2)    1.402       1.416        0.016       0.018
C(2)-C(3)    1.380       1.391        -0.006      (-0.007)
C(3)-C(4)    1.392       1.400        0.006       (0.002)
C(4)-C(5)    1.384       1.392        -0.002      (-0.006)
C(1)-C(7)    1.503       1.501
C(7)-C(9)    1.478       1.498

a Ref. [64]. b Differences δ denote the change in the IBD relative to free benzene.
Table 6. Dependence of the ipso and ortho bond distances in some tensile-spring strained benzene compounds on the annelation angle α. Structural data and the bond localization indices refer to HF/6-31G* results (distances in Å and angles in degrees).

Compound   Angle α   d(ipso)   d(ortho)   δdio    Lb(d)
13         109.6     1.402     1.380      0.022   0.042
14         106.1     1.401     1.369      0.032   0.087
15         104.5     1.399     1.369      0.030   0.087
16         102.4     1.404     1.366      0.038   1.105
indices Lb(d). It should be pointed out that both δdio and Lb(d) in 13 are appreciably higher than the corresponding values in indan 8 (δdio = 0.002 Å and Lb(d) = 0.008), and yet the angle α is only slightly smaller than that in 8. Obviously, an additional feature is present in the tensile-spring aromatics. It is embodied in the lengthening of the annelated bond by the external (poly)cyclic fragment, which introduces additional strain through the ring closure requirement (the tensile-spring influence). In other words, lengthening of the ipso bond contributes to a partial relief of the strain of the molecular string. The ipso bond stretching is followed by the rehybridization effect and a concomitant redistribution of the π-electron density in a typical MN manner. Siegel et al. [55] suggested that a "bicyclic-strain" effect was something distinctly different from the original Mills-Nixon effect. This is, however, unjustified, since the final double bond fixation pattern is in harmony with the MN picture and the underlying mechanism remains essentially the same.
3.5. The Effect of the Double Bond and Lone Pair(s)
It is intuitively clear that external double bond(s) and/or lone pair(s) will reinforce the perturbation of the aromatic sextet. Some illustrative examples are depicted in Scheme 1.

Scheme 1. Selected bond distances of indene 17, benzocyclobutadiene 18 and trans-1,1'-bis(indenylindene) 19.

The MP2(fc)/6-31G* geometry of indene 17 exhibits very little bond length variation within the benzene ring. The localization index Lb(d) = 0.025 is, however, three times larger than that in benzocyclopentene 8. Nevertheless, its value is still low, being characteristic of very weak double bond fixation. A strong bond localization is found in benzocyclobutadiene 18, where both experiment [66] and theory [43,67] predict a pronounced alternation of bond distances. The selected IBD values presented in Scheme 1 are obtained by the MP2(fc)/6-31G* model. They give rise to a relatively high bond localization index Lb(d) = 0.156. The reason behind the strong localization is the antiaromatic interaction across the four-membered ring. It is relieved by the additional shift
(relative to benzocyclobutene 11) of the π-density from the ipso to the two ortho CC bonds, a feature which is also characteristic of the electronic structure of [N]phenylenes (see later). We give here, as an interesting oddity, some structural data of trans-1,1'-bis(indenylindene) 19 (Scheme 1) obtained by X-ray measurements. Although the bond distances do reveal moderate bond fixation in the benzene rings, the authors claim that fusion does not produce noticeable double bond localization [68]. The influence of the lone pair(s) on the bond fixation is studied in a series of heteroanalogs of cyclopropabenzene including the benzocyclopropenyl anion (Fig. 14). Relevant
Figure 14. Schematic representation of heteroanalogs of cyclopropabenzene (20, 21), including the benzocyclopropenyl anion (22).
structural parameters obtained by the HF/6-31G* and MP2(fc)/6-31G* models [43,50] for 20 and 21 are given in Table 7; cyclopropabenzene 12 is included for the sake of comparison. The benzocyclopropenyl anion 22 is treated by the same theoretical models, but using the more flexible 6-31+G* basis set, as is usual for anions [69]. The difference between the HF and MP2 results provides some information on the role of electron correlation in determining the geometric features of fused systems. NBO s-characters and Löwdin π-bond orders are presented for interpretative purposes along with the CC bond distances. Firstly, it should be pointed out that the heavy atoms in systems 20 and 22 are not coplanar, as expected in view of the nonequivalence of the X-H bonds (X = N, C) and their lone pair counterparts placed in spⁿ hybrids. The dihedral angle N-C(1)-C(2)-C(3) in 20 is 163.7° and 175.4° by the MP2 and HF models, respectively. Preliminary MP3(fc)/6-31G* calculations give 170.2°, thus showing that the degree of puckering strongly depends on the electron correlation. This is an interesting finding, since bond angles are generally well reproduced at the HF/6-31G* level. This holds for the molecules presented in Table 7 too. Nonplanarity of the C atoms in the benzocyclopropenyl anion 22 is almost negligible, as evidenced by the dihedral angle of ≈ 4°. On the other hand, the anionic center itself is pyramidalized and the hydrogen is tilted 50.6° out of the ring plane. The pyramidalization is determined by two factors: (a) a tendency of the lone pair to retain as much s-content as possible and (b) the lone pair back-bonding effect, which donates some of the lone pair density to the aromatic ring. It is noteworthy that the molecular puckering takes place in such a way that the overlap between the lone pair orbital and the π-AOs of the fused ring is enhanced, as if to increase their interaction. This effect is even stronger in 20 (vide infra). We conclude the discussion of the shape of
Table 7. Structural parameters of cyclopropabenzene 12 and some of its heteroanalogs (distances in Å and angles in degrees).^a,b

Molecule  Bond/Angle        HF/6-31G*   MP2(fc)/6-31G*   δ(benz)   s-character   πbo
12        C(1)-C(2)         1.332       1.352            -         25.7-25.7     0.62
          C(2)-C(3)         1.370       1.382            -0.015    43.7-33.6     0.66
          C(3)-C(4)         1.400       1.409            0.012     35.2-35.1     0.64
          C(4)-C(5)         1.395       1.408            0.011     35.6-35.6     0.67
          C(1)-C(7)         1.494       1.503            -         30.4-21.0     0.17
          C(1)-C(2)-C(3)    124.7       124.6
          C(2)-C(3)-C(4)    113.0       113.0
          C(3)-C(4)-C(5)    122.3       122.5
          C(1)-C(2)-C(7)    63.5        63.3
20        C(1)-C(2)         1.321       1.334            -         27.5-27.5     0.56
          C(2)-C(3)         1.349       1.374            -0.023    46.5-33.3     0.71
          C(3)-C(4)         1.428       1.421            0.024     34.7-34.2     0.56
          C(4)-C(5)         1.377       1.404            0.007     36.4-36.4     0.74
          C(1)-N            1.462       1.500            -         25.6-18.4     0.13
          C(1)-C(2)-C(3)    126.5       126.2
          C(2)-C(3)-C(4)    110.7       110.8
          C(3)-C(4)-C(5)    122.8       122.9
          C(1)-C(2)-N       63.2        63.4
21        C(1)-C(2)         1.311       1.323            -         28.5-28.5     0.52
          C(2)-C(3)         1.336       1.362            -0.035    48.6-32.7     0.76
          C(3)-C(4)         1.448       1.438            0.041     34.2-33.4     0.51
          C(4)-C(5)         1.369       1.396            -0.001    37.0-37.0     0.78
          C(1)-O            1.425       1.484            -         22.6-15.8     0.14
          C(1)-C(2)-C(3)    128.2       127.8
          C(2)-C(3)-C(4)    108.4       108.5
          C(3)-C(4)-C(5)    123.4       123.7
          C(1)-C(2)-O       62.6        63.6
22        C(1)-C(2)         1.396       1.390            -         25.3-25.3     0.63
          C(2)-C(3)         1.354       1.379            -0.018    43.9-32.9     0.66
          C(3)-C(4)         1.455       1.446            0.049     35.1-34.9     0.64
          C(4)-C(5)         1.360       1.389            -0.008    35.2-35.2     0.67
          C(1)-C(7)         1.457       1.467            -         30.6-20.4     0.17
          C(1)-C(2)-C(3)    123.3       123.5
          C(2)-C(3)-C(4)    114.9       114.5
          C(3)-C(4)-C(5)    121.6       122.0
          C(1)-C(2)-C(7)    61.4        61.8

a The difference δ gives the change in CC distances relative to free benzene. b HF and MP2 calculations on 22 are carried out by using the 6-31+G* basis set.
these molecules by noting that the angular deformations in 20 and 21 are highly pronounced. The C(2)-C(3)-C(4) angle in the latter compound is 108.5°, thus being smaller than the tetrahedral value! It is also remarkable that the C(1)-C(2)-X angle (X = C(7), N and O) is practically constant, being 63°, whereas the corresponding C(6)-C(1)-X angles are close to 180°. Concomitantly, the apical C(1)-X-C(2) angle of the small rings is practically constant, being very close to 53°. It appears that fusion of three-membered rings leads to extraordinary deformations indeed. The lengths of the annelated bonds are peculiar too. They decrease along the series 12, 20 and 21. This is counterintuitive in spite of the fact that the average s-character slightly increases in the same direction. The puzzling detail is that the π-bond order of the fused bond in 21 is rather low (≈ 0.5) and yet its bond length is the shortest. Apparently, an answer should be sought in the structure of the small three-membered rings. Their geometries, as obtained by the MP2(fc)/6-31G* model, are presented in Scheme 2 (distances in Å and angles in degrees). In addition to optimal
Scheme 2. MP2(fc)/6-31G* geometries of the small three-membered rings: equilibrium structures A and B and deformed model systems C and D (distances in Å and angles in degrees).
equilibrium structures A and B, model systems C and D, obtained by squeezing the H-C=C angle to 125.9° and 127.8° as found in the fused molecules 20 and 21, are considered too. One observes that the deformation imposed on the three-membered ring increases the double bond length by ≈ 0.01 Å. In both cases the double bond in oxirene is shorter than that in azirene by 0.01 Å, a feature which is transferred to the fused systems. The single bond distances are of particular interest, since they are dramatically shortened in the annelated systems 20 and 21 to 1.500 Å and 1.484 Å, respectively. This finding can be rationalized by the substantial lone pair back-bonding effect. Heteroatoms transfer some of the lone pair electron density to the aromatic ring via the resonance mechanism illustrated by Fig. 15. The first two Kekulé resonance structures are the most important ones, but
Figure 15. Resonance effect in heteroanalogs of cyclopropabenzene involving π-back donation provided by lone pair(s), where X = NH and O.
they are not equivalent. The second structure is much less favourable because its π-bond localization is antagonistic to the rehybridization changes taking place in the molecular plane. Additionally, the interaction between the double bond fixed at the carbon junction atoms and the lone pair(s) is destabilizing because of the Coulomb repulsion and the antiaromatic distribution of coupled electron pairs. Hence, the first Kekulé structure will have a much larger weight and its distribution of π-electrons will prevail. However, the delocalization of the lone pair(s) described by the remaining six resonance structures cannot be neglected. In fact, it describes a partial double bond character of the essentially single bonds involving heteroatoms, which ultimately leads to their shrinkage as observed in molecules 20 and 21. Moreover, the bond shortening effect is more pronounced in the latter system, since two lone pairs (rabbit ears) participate in the delocalization. At the same time, the lone pair back-donation resonance interaction contributes to the very low π-bond orders of the fused ipso bonds. Therefore, the interpretation of the π-bond fixation in systems like 20 and 21 involving lone pair heteroatoms by antiaromaticity alone, offered by Siegel et al. [18], is an oversimplification. It goes without saying that the resonance mechanism described by Fig. 15 is operative in the benzocyclopropenyl anion 22 too, but in this case X⁺ should be replaced by the neutral CH group, whereas X should be replaced by the CH⁻ group. Perusal of the CC bond distances and the accompanying π-bond orders presented in Table 7 reveals that an appreciable MN effect occurs in systems 20-22. This is evidenced by the alternation of their IBD values and by the differences δ(benz) relative to the free benzene gauge value. Bond fixation is reflected in the corresponding bond localization indices L(d). They read: 0.064 (0.068), 0.099 (0.167), 0.155 (0.242) and 0.153 (0.238) for 12, 20, 21 and 22, respectively, where the first number refers to the MP2 results and the Hartree-Fock values are given within parentheses. It follows that the oxa-heteroanalog 21 exhibits the largest MN effect. It is also apparent that the HF model exaggerates the bond fixation in the aromatic moiety, but its qualitative predictions are correct in most cases.
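The statement that the HF model exaggerates bond fixation can be made quantitative with the L(d) values quoted above. The following Python sketch simply forms the HF/MP2 ratio for each molecule; the names `L_D` and `hf_over_mp2` are illustrative assumptions, and the ratio itself is only one possible way of expressing the comparison.

```python
# L(d) values quoted in the text as (MP2, HF) pairs for molecules 12, 20, 21, 22.
# The container and helper names are hypothetical.
L_D = {
    "12": (0.064, 0.068),
    "20": (0.099, 0.167),
    "21": (0.155, 0.242),
    "22": (0.153, 0.238),
}


def hf_over_mp2(pair):
    """Ratio of the Hartree-Fock to the MP2 localization index."""
    mp2, hf = pair
    return hf / mp2


ratios = {mol: round(hf_over_mp2(pair), 2) for mol, pair in L_D.items()}

# Molecule with the largest MP2 index, i.e. the strongest MN effect in this set.
largest_mp2 = max(L_D, key=lambda mol: L_D[mol][0])

for mol, ratio in ratios.items():
    print(f"{mol}: L(d)_HF / L(d)_MP2 = {ratio}")
print("largest MP2 index:", largest_mp2)
```

All four ratios exceed unity, and the excess is markedly larger for the lone pair systems 20-22 than for the parent hydrocarbon 12.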
3.6. Amplification of the Mills-Nixon Effect
The MN bond fixation can be enhanced (or curtailed) by a judicious choice of the right substituents placed at proper positions [70,71]. However, more dramatic localization effects are achieved by multiple annelation. Some representative trisannelated systems exhibiting the MN effect are depicted in Fig. 16. We shall focus only on the central benzene ring. It is easy to show that all its internal CCC angles are 120°. The anisotropy of the CC bond distances varies along the series as expected. Available IBD values for the ipso and ortho bonds obtained both by theory and experiment are summarized in Table 8. It appears that the scaled HF/3-21G procedure [46] predicts an anisotropy of the ipso and ortho bonds of δio = 0.018 Å in 23. The experimental estimate is 0.008 ± 0.02 Å, implying that it is well below the error limits. It is probable that the simple theoretical model employed overestimates the bond localization in this compound, but a small MN effect would fit the overall picture developed by numerous quite reliable calculations. In contrast, δio in tricyclobutabenzene 24 is 0.023 Å, this being larger than the experimental error. This value is in accordance with the theoretical estimate δio = 0.017 Å. Hence, one can conclude that the bond fixation in 23 is very small, being more of conceptual than of practical significance. On the other hand, the bond alternation in 24 is small but not negligible. The largest bond anisotropy is found in triscyclobutadienobenzene 28, where δio assumes values as high as 0.19 Å and 0.18 Å, as obtained by the scaled HF/3-21G and HF/6-31G* models, respectively. These findings are not corroborated by the bond alternation (or anisotropy) index Lb(d), because the computational results for 23 and 24 were not obtained at the same theoretical level, whilst Lb(d) is not applicable to 25.
It is fair to say that annelation of three cyclobutadiene moieties in 28 freezes almost completely the Kekulé resonance structure which localizes the double bonds at the ortho positions. This is evidenced by the very low π-bond orders along the ipso bonds and by structural parameters which are very close to those found in 3,4-dimethylenecyclobutene [32]. Triangulenes 25-27 exhibit remarkable contractions of the ortho CC bonds, which become shorter than the fused bonds belonging to the three-membered rings. For instance, CC(ortho) in 25 is shorter than CC(ipso) by 0.01 Å, in contrast to the monoannelated benzocyclopropene 12, where the CC(ortho) bond is longer than the fused bond by 0.03 Å (Table 3). The reason behind the distribution of bond distances found in 25 is most likely a concerted shift of both s-content and π-bond density from the ipso to the ortho bonds [21] in a way balanced over the whole molecule. This
Figure 16. Schematic representation of some representative trisannelated benzene compounds (23-31) exhibiting the Mills-Nixon effect.
effect is appreciably enhanced in the heteroanalogs 26 and 27, where the strong lone pair back-bonding mechanism is highly effective, as discussed earlier (Fig. 15). It should be stressed
Table 8. Anisotropy of alternating CC bonds of the benzene fragment in some trisannelated Mills-Nixon systems as mirrored by IBDs (in Å).* Theoretical values are given first; the second value of each entry is given within parentheses.

Compound  Bond    Distance                     δio               δb
23        ipso    1.393^a; (1.395 ± 0.01)^b    0.018; (0.008)    -0.006; (-0.001)
          ortho   1.375; (1.387 ± 0.01)                          -0.024; (-0.009)
24        ipso    1.406^c; (1.413)^d           0.017; (0.023)    0.009; (0.017)
          ortho   1.389; (1.390)                                 -0.008; (-0.006)
25        ipso    1.379^e; (1.377)^f           -                 -
          ortho   1.369; (1.363)                                 -0.018; (-0.019)
26        ipso    1.395^g; (1.392)^h           -                 -
          ortho   1.304; (1.334)                                 -0.082; (-0.063)
27        ipso    1.385^g; (1.395)^h           -                 -
          ortho   1.289; (1.316)                                 -0.097; (-0.081)
28        ipso    1.522^i; (1.500)^j           0.191; (0.183)    0.123; (0.114)
          ortho   1.331; (1.317)                                 -0.068; (-0.069)
29        ipso    1.416^k; (1.417)^l           0.052; (0.038)    0.030; (0.021)
          ortho   1.364; (1.379)                                 -0.022; (-0.017)
30        ipso    1.430^m; (1.438)^n           0.057; (0.089)    0.033; (0.042)
          ortho   1.373; (1.349)                                 -0.024; (-0.047)
31        ipso    1.425^o                      0.072             0.029
          ortho   1.353                                          -0.043

* The difference between the CC ipso and ortho bonds is denoted by δio, whereas the change relative to the CC bond in free benzene is denoted by δb.
a Scaled HF/3-21G results of ref. [46]. b X-ray data of ref. [47]. c MP2/6-31G* results of ref. [17]. d X-ray data of ref. [17]. e MP2(fc)/6-31G* results of ref. [61]. f MP3(fc)/6-31G* results of ref. [61]. g HF/6-31G* results of ref. [43]. h MP2(fc)/6-31G* results of ref. [43]. i Scaled HF/3-21G results of ref. [32]. j HF/6-31G* results of ref. [18]. k HF/6-31G* results of ref. [72]. l X-ray data of ref. [72]. m MP2/6-31G* results of ref. [73]. n X-ray data of ref. [55]. o Data of ref. [74].
that the oxygen heteroanalog 27 exhibits a dramatic shortening of the ortho bonds, which is as large as 0.081 Å relative to benzene according to the MP2(fc)/6-31G* results. This is compatible with the very high π-bond order (0.076), in contrast to the low value associated with the annelated bond (0.35). A brief digression is of some importance here. The bond localization index Lb(d) cannot be applied to triangulenes 25-27 as a measure of the bond anisotropy. The reason is simple: the annelated bonds have to be compared to the double bond of the corresponding small three-membered ring, whereas the ortho bonds have to be gauged against the CC bond in free benzene. Consequently, the similarity of bond distances, e.g. in 25, does not imply that the MN effect is negligible in this molecule. The extent of the MN localization is reflected in the shortening of the ortho bond δb relative to benzene, which assumes the values -0.02, -0.06 and -0.08 Å along the series 25, 26 and 27, respectively, as obtained by the MP2(fc)/6-31G* model. An additional interesting feature of all triangulenes is that the external C-C-X (X = C(7), N and O) bond angles are close to 180°. For instance, the bond angle C(6)-C(1)-O in 27 is 181.6°. The anisotropy of the CC bonds in the trisannelated bicyclic systems 29-31 originates from the rehybridization at the carbon junction atoms and the additional stretching of the ipso bonds by the bicyclic molecular tensile springs. The bond anisotropy indices L(d) in 29, 30 and 31 based on the X-ray set of data are 0.114, 0.267 and 0.216, respectively. Hence compound 30 exhibits the largest bond fixation.
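The anisotropy δio listed in Table 8 is just the difference between the ipso and ortho distances, so the tabulated entries can be cross-checked in a few lines. The Python sketch below does this for a subset of the theoretical entries (23, 24, 28, 30 and 31); the names `TABLE8_THEORY` and `delta_io` are illustrative assumptions. Since the entries stem from different theoretical levels, the resulting ordering should be read as indicative only.

```python
# Theoretical (ipso, ortho, tabulated delta_io) entries from Table 8, in angstrom.
# Only a subset of compounds is included; names are hypothetical.
TABLE8_THEORY = {
    "23": (1.393, 1.375, 0.018),
    "24": (1.406, 1.389, 0.017),
    "28": (1.522, 1.331, 0.191),
    "30": (1.430, 1.373, 0.057),
    "31": (1.425, 1.353, 0.072),
}


def delta_io(ipso, ortho):
    """Bond anisotropy d(ipso) - d(ortho), rounded to table precision."""
    return round(ipso - ortho, 3)


# Recompute delta_io and compare with the tabulated value for each compound.
consistent = {
    compound: delta_io(ipso, ortho) == tabulated
    for compound, (ipso, ortho, tabulated) in TABLE8_THEORY.items()
}

# Rank the compounds by decreasing anisotropy.
ranking = sorted(
    TABLE8_THEORY,
    key=lambda c: delta_io(TABLE8_THEORY[c][0], TABLE8_THEORY[c][1]),
    reverse=True,
)

print(consistent)
print("decreasing anisotropy:", ranking)
```

The recomputed differences agree with the tabulated δio values for these compounds, and 28 stands out with by far the largest anisotropy, in line with the discussion of triscyclobutadienobenzene above.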
3.7. Extended π-Systems: [N]phenylenes
Fused planar molecules involving juxtaposed 4n and 4n+2 π-electron fragments represent a very interesting family of extended π-systems with competing destabilizing and stabilizing cycles. They are named the [4n]annuleno[4n+2]annulenes. A particularly attractive subset of these compounds is given by n = 1, implying the presence of the cyclobutadiene (CB) and benzene moieties. The former introduces considerable σ angular strain in addition to the unfavorable 4π-electron configuration. The interplay of the destabilizing angular strain and antiaromatic character on one side and the stabilizing aromaticity of the benzene rings on the other may well lead to a range of new and unexpected properties which deserve attention. The archetypal molecule of this kind is provided by biphenylene, but sequential cyclobutabenzoannelation yields a variety of linear and bent [N]phenylenes, where N stands for the number of benzene fragments, whereas N-1 gives the number of four-membered rings. Synthetic routes to [N]phenylenes have been developed by Vollhardt et al. [75]. They pointed out that linear [N]phenylenes are good candidates for organic conductors, ferromagnets, etc. in view of the decreasing HOMO-LUMO gap as N increases [76].
We commence discussion of theoretical results obtained recently [77] for biphenylene as the progenitor of the [N]phenylene class of compounds. Its structural and electronic properties are crucial for an understanding of the larger systems. The structural parameters of biphenylene and the first higher linear and bent [3]phenylenes (Fig. 17) obtained by the HF/6-31G* model are given in Table 9. The most conspicuous feature of biphenylene (a) is the alternation of bond distances around the aromatic perimeter. Let us focus on the rehybridization effect first.
Since small rings prefer hybrids with increased p-character, there is a substantial shift of the s-character to the four CC bonds emanating from the cyclobutadiene fragment. Concomitantly, these bonds should be shorter than the remaining bonds of the benzene rings. It is easy to see that the π-electron localization acts in the same direction as a result of a pure electronic effect. This is obvious from inspection of the three relevant Kekulé structures depicted in Fig. 18. It appears that K2 and K3 taken together lead to a uniform shortening of the CC bonds in the benzene rings. The K1 structure is energetically the most favourable one since it maximally avoids the antiaromatic interaction within the cyclobutadiene moiety. Additionally, it stabilizes the four ortho CC bonds in harmony with the rehybridization effect. Synergism of σ- and π-electrons yields a substantial shrinkage of the ortho bonds, which are shorter by 0.01 Å than the distal C(2)-C(3) and C(6)-C(7) bonds. This intuitive interpretation is supported by the distribution of the NBO s-characters and π-bond orders (Table 9). A shift of the π-electron density from annelated to ortho bonds is characteristic for the whole family of linear and angular [N]phenylenes [78]. Furthermore, a low π-bond order (0.21)
Table 9
Biphenylene and [3]phenylenes. Structural parameters, s-characters (in %) and Löwdin π-bond orders as calculated by the HF/6-31G* model (distances in Å and angles in degrees).

Bond/Angle            HF/6-31G*   Exptl.    s-character   π-bond order

biphenylene
C(1)-C(2)             1.417       1.423 a   34.2-34.0     0.56
C(2)-C(3)             1.373       1.385     36.3-36.3     0.74
C(4)-C(4a)            1.357       1.372     35.4-40.3     0.72
C(4a)-C(4b)           1.507       1.514     30.2-30.2     0.21
C(4a)-C(8b)           1.414       1.426     29.3-29.3     0.52
C(1)-C(2)-C(3)        121.9       121.6
C(3)-C(4)-C(4a)       115.7       115.5
C(4)-C(4a)-C(4b)      122.4       122.4

linear [3]phenylene
C(1)-C(2)             1.424       1.436 b   33.8-33.7     0.53
C(2)-C(3)             1.368       1.397     36.4-36.4     0.76
C(4)-C(4a)            1.352       1.359     35.5-40.6     0.74
C(4a)-C(4b)           1.508       1.512     30.0-30.0     0.16
C(4b)-C(5)            1.383       1.385     34.3-29.1     0.64
C(4a)-C(10b)          1.417       1.397     29.0-29.0     0.50
C(4b)-C(10a)          1.402       1.407     30.6-30.6     0.60
C(1)-C(2)-C(3)        121.8       120.1
C(1)-C(10b)-C(4a)     122.5       121.7
C(2)-C(1)-C(10b)      115.7       118.3
C(4b)-C(5)-C(5a)      112.0       112.0

angular [3]phenylene
C(1)-C(2)             1.409       1.400 b   34.2-34.4     0.59
C(2)-C(3)             1.379       1.370     35.9-35.9     0.71
C(1)-C(10b)           1.363       1.368     39.8-35.1     0.70
C(4)-C(4a)            1.363       1.365     35.1-40.0     0.70
C(4a)-C(10b)          1.410       1.413     29.4-29.6     0.54
C(4a)-C(4b)           1.498       1.503     30.1-31.6     0.23
C(10b)-C(10a)         1.502       1.505     30.3-30.5     0.23
C(4b)-C(4c)           1.335       1.345     40.7-40.7     0.77
C(4b)-C(10a)          1.449       1.449     28.1-27.5     0.41
C(10a)-C(10)          1.345       1.348     41.3-36.5     0.79
C(9)-C(10)            1.451       1.446     33.0-33.0     0.45
C(1)-C(2)-C(3)        121.8       122.4
C(2)-C(1)-C(10b)      115.7       115.3
C(10a)-C(10)-C(9)     117.5       117.7

a X-ray data of J.K. Fawcett and J. Trotter, Acta Crystallogr., 20 (1966) 87.
b X-ray data of ref. [76].
Figure 17. Numbering of atoms in biphenylene (a) as well as in linear and angular [3]phenylenes, denoted by (b) and (c), respectively.
of the bridge C(4a)-C(4b) bond indicates a small but nonzero delocalization between the two benzene fragments, which is important for understanding the structural features of the four-membered ring. The fused and bridge bonds have similar average s-characters, but considerably different bond lengths (1.414 vs. 1.507 Å). The latter are in accordance with the corresponding π-bond orders, which assume values of 0.52 and 0.21, respectively. Interestingly, the bridge bond is significantly shorter than in free cyclobutadiene, where perfectly localized double bonds are found (1.507 vs. 1.565 Å, respectively, at the HF/6-31G* level). Hence, its antiaromatic character is considerably reduced in biphenylene, which is evident also from the decreased π-bond order of the annelated bonds (0.52). As a final comment we would like to draw attention to the angular deformation of the benzene rings caused by fusion of the CB fragment. For example, the apical ortho-meta angle C(2)-C(1)-C(8b) is as low as 115.5°, indicating a certain spill-over of the angular strain.
Figure 18. Relevant Kekulé resonance structures K1, K2 and K3 in biphenylene.
Linear [3]phenylene is interesting because it has two types of benzene fragments. The central benzene unit exhibits a highly pronounced delocalization, as revealed by a rather uniform distribution of π-bond orders, which are all close to the free benzene value of 0.67. This is further supported by the bond localization index, which assumes a very low value (0.05). In contrast, the peripheral rings are more localized than in the parent biphenylene, as evidenced by the Lb(dcc) values (0.19 vs. 0.14). It is interesting to observe that the π-bond order for the C(10a)-C(10) bond is the same as the average value for the C(1)-C(2) and C(1)-C(8b) bonds. The same holds for the bond distances. Finally, the central benzene ring has some Baeyer angular strain, because the apical C(10a)-C(10)-C(9b) angle is only 112°. The decrease from the ideal 120° value is twice as large as in biphenylene, revealing additivity of the angular strain spill-over effect. The picture of the angular [3]phenylene offered by both the HF/6-31G* model and the X-ray structure is just the opposite of that in its linear counterpart: the central benzene ring is more localized than in biphenylene, whereas the terminal ones are slightly more delocalized. This is reflected in their Lb(dcc) indices, which are 0.32 and 0.12, respectively. It should be mentioned that the double bond C(4b)-C(4c) bridging the two four-membered fragments is the shortest and most localized one, which is compatible with its high average s-character of 41%. We would like to mention briefly that the largest effect is found in the propeller-like triangular [4]phenylene 32 involving three coalesced biphenylenes (Scheme 3). The bond distances of the central ring (in Å) are obtained by the MP2(fc)/6-31G* model [79]; they compare reasonably well with the experimental X-ray data [80] given within parentheses. The anisotropy difference δio = 0.124 (0.185) Å illustrates a dramatic effect of triannelation on the structural features of the common benzene moiety.
The corresponding AM1 bond distances are 1.326 Å and 1.526 Å, yielding δio = 0.2 Å, which is somewhat exaggerated [81]. However, analysis of the π-bond orders does indicate that the degree of bond localization in this molecule is surpassed only by that in 28 and 3,4-dimethylenecyclobutane. It appears that the Kekulé structure of the central ring is almost completely frozen. In concluding remarks we would like to emphasize that the agreement between the theoretical and experimental geometries is reasonably good (Table 9). The structural features of higher phenylenes are similar to those of linear and bent [3]phenylenes [78]. The HOMO-LUMO gap rapidly decreases as N increases in agreement with experimental
Scheme 3. Compound 32.
observations. This splitting is higher in angular than in linear phenylenes for the same N. In contrast, bent phenylenes are more stable than their linear counterparts [78] in spite of a more pronounced bond localization. A credible explanation is given by the decreased antiaromaticity of the CB moieties in the former subset of compounds. Finally, variation in structural parameters within the benzene rings is indicative of the presence of a substantial MN effect, which is a result of the concerted and cooperative action of σ- and π-electrons. The largest effect is found in the molecular propeller compound 32.

4. Reversed Mills-Nixon Effect
There are molecular systems exhibiting π-bond fixation patterns that are entirely opposite to that induced by the Mills-Nixon effect [82,83,67]. Typical examples of this kind are provided by benzoborirene 33 and benzocyclopropenyl cation 34 (Fig. 19). These compounds represent extended π-systems relative to benzene itself, since they now encompass "empty" π-orbitals at the B and C+ atoms, respectively. The structural parameters offered by the HF/6-31G* [82] and MP2(fc)/6-31G* [43] models are given in Table 10. Both molecules are planar. A salient feature of the aromatic CC bonds is their stretching relative to benzene at the ortho and para positions. In contrast, the meta bonds are more localized and shortened. Another striking property is a pronounced delocalization within the three-membered ring (an aromatic pattern involving 2π electrons), as easily visualized by the resonance structures shown in Scheme 4. The same resonance mechanism is operative in the benzocyclopropenyl cation. The ionic charge transfer structures involve the double bond coupling at the meta positions occurring three times, in contrast to the ortho and para bonds, where π-electron perfect spin
[Structure drawings of 33, 34, 35, 35a, 36 and 36a with atom numbering; the angle labels 122° and 124.7° appear in the drawings.]
Figure 19. Schematic representation and numbering of atoms in reversed MN-systems benzoborirene 33 and benzocyclopropenyl cation 34.
pairing takes place only twice. There is no π-electron spin coupling along the ipso bond. In contrast, the C(1)-B and C(2)-B bonds have double bond character in three resonance structures, which in turn represents an increase relative to two such structures in the free three-membered ring 35. It follows that the resonance interaction depicted in Scheme 4 gives a good qualitative interpretation of the bond fixation in the reversed MN-systems 33 and 34. It is consistent with the evidence provided by the Löwdin π-bond order analysis presented in Table 10. Concomitantly, the meta bonds are condensed whereas the para bonds are substantially stretched relative to benzene. It is noteworthy that the angular deformations of the benzene fragment are similar to those found in MN-systems. This is not unexpected, because angular distortions are dictated by rehybridization. In this connection it is worth mentioning that the C(3)-C(2)-B and C(3)-C(2)-C(7) angles in 33 and 34 are 175.6° and
Table 10
Structural parameters of benzoborirene and benzocyclopropenyl cation as obtained by the HF and MP2 models (distances in Å and angles in degrees) a

Bond/Angle          HF/6-31G*   MP2(fc)/6-31G*   s-character   πbo

benzoborirene 33
C(1)-C(2)           1.374       1.411            22.0-22.0     0.54
C(2)-C(3)           1.414       1.405            40.0-33.6     0.51
C(3)-C(4)           1.360       1.388            35.4-36.0     0.69
C(4)-C(5)           1.434       1.423            34.5-34.5     0.54
C(1)-B              1.472       1.483            37.9-28.4     0.48
C(1)-C(2)-C(3)      122.2       122.0
C(2)-C(3)-C(4)      115.6       115.8
C(3)-C(4)-C(5)      122.2       122.2

benzocyclopropenyl cation 34
C(1)-C(2)           1.372       1.412            18.9-18.9     0.41
C(2)-C(3)           1.415       1.401            45.5-30.7     0.51
C(3)-C(4)           1.357       1.390            35.2-35.7     0.71
C(4)-C(5)           1.465       1.443            33.9-33.9     0.48
C(1)-C(7)           1.351       1.374            35.4-29.6     0.58
C(1)-C(2)-C(3)      124.8       124.6
C(2)-C(3)-C(4)      115.6       111.3
C(3)-C(4)-C(5)      124.1       124.1

35
C(1)-C(2)           1.340       1.360            29.2-29.2     0.74
C(1)-B              1.465       1.475            33.8-27.6     0.52
C(1)-C(2)-B         62.8        62.5
H-C(1)-C(2)         138.7       139.0

35a
C(1)-C(2)           1.356       1.380            23.3-23.3     0.74
C(1)-B              1.461       1.471            40.0-28.3     0.53
C(1)-C(2)-B         62.4        62.0

36
C-C                 1.349       1.369            28.3-28.3     0.62
C-C-C               60          60

36a
C(1)-C(2)           1.369       1.392            19.0-19.0     0.60
C(1)-C(7)           1.345       1.364            36.3-28.9     0.62
C(1)-C(2)-C(7)      59.4        59.3

a s-characters and π-bond orders correspond to MP2(fc)/6-31G* wavefunctions.
177.4°, respectively. Another point of interest is the lengthening of the C(1)-C(2) bond in both small rings 35a and 36a upon H-C-C deformation, which mimics annelation. This distance increases by 0.02 Å in both molecular models. Additional stretching, which takes place in the true molecules 33 and 34, is a consequence of the increased delocalization over
Scheme 4. Resonance structures of benzoborirene 33.
the whole system, as illustrated by Scheme 4. It is noteworthy that the hybrid orbital placed at the carbon junction atom and emanating toward the B atom has a somewhat increased s-character, in accordance with the Walsh-Bent rule. The average s-character of the ortho bonds is rather high (36.8%) and yet they are stretched relative to benzene. Apparently, the π-electron density redistribution prevails in anti-MN systems. This finding is in full accordance with our analysis of bond length alternation in triphenylene, where it was found that a reversed MN-pattern occurred due entirely to the π-electron mode of interaction (Section 3.2). It is also interesting that the π-bond orders of the three-membered ring in 33 are practically equal, being ≈ 0.5. Bonding parameters in 34 follow the same pattern with some notable quantitative differences. For example, the s-character in the fused bond drops to only 19%. The π-bond order of the C(1)-C(7) bond is higher than that of the ipso bond, indicating increased resonance interaction between the aromatic moiety and the cationic center. Some perfluorinated fused systems like 37 and 38 (Fig. 20) exhibit the anti-MN mode of π-electron localization [21,67,83]. This conclusion can be easily drawn, at least in a qualitative sense, from the resonance structures depicted in Scheme 5, representing the well known no-bond double-bond (hyperconjugative) delocalization, which should be distinguished from negative hyperconjugation [85,86]. The latter is confined to the CF2 group exclusively, leading to short C-F bonds and an acute F-C-F angle of only 106° [21,67,83]. It is important to notice that the last two resonance structures shown in Scheme
[Structure drawings of 37, 38, 39 and 40.]
Figure 20. Schematic representation of some triannelated reversed Mills-Nixon systems.
5 concentrate the π-density at the ipso CC bonds of the remaining two small rings. Taking into account all resonance structures of this kind one concludes that: (a) the π-electron density is shifted from the ortho to the ipso bonds and (b) there is a sort of aromatic delocalization over the three-membered rings, since the C(7) apical carbons act as pseudo cationic centers in view of their rather high formal positive charge. These qualitative conjectures are corroborated by the π-bond orders, which serve here as a useful quantitative diagnostic tool (Table 11). For example, the π-bond orders of the ipso and ortho bonds in 37 are 0.76 and 0.54, respectively, thus being just antipodal to the corresponding values in tricyclopropabenzene 25, where they read 0.61 and 0.70 [21]. Concomitantly, the ortho bonds in 37 are longer by 0.027 Å than their counterparts in 25. By the same token, the ortho bonds in perfluorotricyclobutabenzene 38 are longer than those in the parent hydrocarbon 24 by 0.02 Å according to HF/6-31G* calculations [67]. It follows that rehybridization on one side and π-electron resonance (in extended π-systems such as 33 and 34) and/or π-density redistribution triggered by hyperconjugation (in the perfluoro compounds 37 and 38) on the other side act in an antagonistic manner. It appears also that resonance and hyperconjugation dominate in these systems, yielding a reversed MN pattern of bond fixation in the benzene ring. A similar mechanism is operative in the silacyclopropabenzenes and silacyclobutabenzenes 39 and 40, where the Si atoms act as pseudo cationic centers. It appears that the ortho bonds in both compounds are longer than that in free benzene, the amount of stretching being 0.025 Å (MP3(fc)/6-31G* model) and 0.013 Å (MP2(fc)/6-31G* model) in 39 and 40,
Scheme 5. Hyperconjugative (no-bond double-bond) resonance structures of the perfluorinated system.
Table 11
Anisotropy of CC bonds of the central benzene ring in reversed MN-systems 37-40 (in Å)

Molecule   Bond    Distance             πbo
37         ipso    1.332 a (1.381) b    0.76
           ortho   1.408 (1.397)        0.54
38         ipso    1.370 c              0.70
           ortho   1.397                0.61
39         ipso    1.406 d (1.385) e    0.73
           ortho   1.405 (1.420)        0.47
40         ipso    1.426 f              0.65
           ortho   1.410                0.58

a HF/6-31G* results of ref. [83].
b MP2/6-31G* results of ref. [21].
c HF/6-31G* results of ref. [67].
d MP2(fc)/6-31G* results of ref. [61].
e MP3(fc)/6-31G* results of ref. [61].
f MP2(fc)/6-31G* results of ref. [84].
respectively [61,84]. This is in harmony with the π-bond orders given in Table 11. An interesting blow-up phenomenon is found in compound 40, meaning that the average CC distance of the aromatic nucleus is larger than that in free benzene. This effect is interpreted by the π-electron charge transfer interaction between the highest occupied MOs of the benzene ring and the antibonding fragment orbitals of the SiH2 groups [84]. Interesting reversed MN-systems are provided by benzocyclobutadiene complexed by Fe(CO)3 fragments. A mild anti-MN distortion is observed in the tricarbonyliron complex with benzocyclobutadiene, where the Fe atom is placed above the four-membered ring [87]. Similarly, a modest reversed Mills-Nixon π-localization was observed in the triannelated trans-[Re(CO)3]3-triindenyl system [88], as evidenced by the average X-ray distances of 1.440 Å and 1.451 Å related to the ipso and ortho bonds, respectively. More experimental and theoretical research is desirable in this direction. In conclusion, we would like to offer the following tentative definition of reversed Mills-Nixon systems: they are compounds possessing perturbed central aromatic moieties, where σ- and π-electrons act in an antagonistic manner. The π-electron localization pattern is governed by the strong conjugative or hyperconjugative interaction with the annelated rings. It prevails, yielding a reversed bond fixation picture compared to ordinary MN-systems. It is useful to keep in mind that the conjugation/hyperconjugation modes of interaction involve partial aromatization of the fused rings as a rule.

5. Chemical Consequences of the Mills-Nixon Effect
5.1. Electrophilic Substitution Reactivity
Much of the electrophilic reactivity of aromatics is described in great detail in a comprehensive recent book by Taylor [10]. We shall focus attention on the electrophilic substitution reactivity of annelated benzenes and try to interpret the orientational ability of fused small rings. For this purpose we consider here Wheland σ-complexes [89], which seem to represent very well the transition states of electrophilic substitution reactions. It is also convenient to take the proton as a model of the electrophilic reagent. In order to delineate the rehybridization and π-electron localization effects, let us consider a series of angularly deformed benzenes (Fig. 21), where two vicinal CH bonds bent toward each other mimic a fused small ring. Angles ε of 111° and 94° simulate five and four membered
Figure 21. Relevant resonance structures in deformed benzene 41 and benzenium ion 42 and numbering of atoms.
carbocycles, respectively. This model system describes the angular strain effect, which is free of "contamination" introduced by external conjugation or hyperconjugation taking place in true annelated small rings. The dominant MN mode of localization is illustrated by 41. The protonated benzene is also planar apart from the sp3 hybridized center produced by the protonation. The structure of the benzenium ion 42 is well known and the π-electron localization is best described by the predominating resonance structure 42 [11]. It is quite clear that the two modes of π-electron localization, one induced by the angular deformation (ground state effect) and the other triggered by protonation (transition state effect simulated by the Wheland intermediate), cannot achieve perfect matching if both events are present at the same time. In addition, α- and β-protonation will exhibit different degrees of matching or mismatching, which leads to regioselectivity in electrophilic substitution reactions. Moreover, comparison of the resonance structures for the α- and β-proton attack reveals that they essentially differ in structures 43(e) and 44(e) depicted in Fig. 22. It appears that the 43(e) resonance structure is antagonistic to the
Figure 22. Representative resonance structures 43(e) and 44(e) of α- and β-protonated deformed benzenes.
localization pattern induced by the angular strain, as described by 41, in the most critical domain, namely, around the fused bond. In contrast, β-protonation has inter alia the resonance structure 44(e), which is more compatible with the π-bond fixation produced by the σ-rehybridization effect. It follows that the β-position is more susceptible to electrophilic attack. Actual ab initio results involving HF/6-31G* geometry optimization constrained by fixed ε and supplemented by single point MP2(fc)/6-31G*//HF/6-31G* calculations show that the conjecture derived in qualitative terms is correct. There is a linear relation describing the difference in energy Eα - Eβ as a function of the deformation angle ε, where Eα and Eβ denote the total electronic energies of the corresponding protonated species. The quality of the correlation obtained by the MP2(fc)/6-31G*//HF/6-31G* model:
$$E_\alpha - E_\beta = 1.99\cdot 10^{1} - 1.66\cdot 10^{-1}\,\varepsilon \quad \text{(in kcal/mol)} \qquad (6)$$
is very good, as evidenced by the correlation coefficient R = 0.9997. The slope of the straight line is negative, indicating that selectivity increases as the angle of annelation of a small ring decreases. It should be mentioned that benzene itself (with ε = 120°) represents to some extent tetralin, if the hyperconjugative effect in the latter molecule is neglected in the first approximation. A similar relationship is obtained for the HF/6-31G* model, although the line passing through the points ε = 94°, 111° and 120° is slightly curved (Fig. 23).
Figure 23. Linear dependence of the difference in total energies between the α- and β-Wheland intermediates on the angle ε of the annelated small ring simulated by the H-C=C deformation in benzene (abscissa: annelation angle).
$$E_\alpha - E_\beta = 5.75\cdot 10^{1} - 7.5646\cdot 10^{-1}\,\varepsilon + 2.3127\cdot 10^{-3}\,\varepsilon^{2} \quad \text{(in kcal/mol)} \qquad (7)$$
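As a numerical check, relations (6) and (7) can be evaluated at the three deformation angles discussed in the text. The sketch below uses hypothetical function names. It also rests on one assumption: the quadratic coefficient of relation (7) is taken as 2.3127·10⁻³, because only this order of magnitude reproduces the HF/6-31G* values of Eα - Eβ listed in Table 12 (2.0 and 6.8 kcal/mol at 111° and 94°) and a vanishing difference for undistorted benzene at 120°.

```python
# Sketch: evaluating the fitted relations (6) and (7) for E_alpha - E_beta (kcal/mol).
# Function names are hypothetical; the angle eps is given in degrees.


def delta_e_mp2(eps: float) -> float:
    """Linear MP2(fc)/6-31G*//HF/6-31G* relation, eq. (6)."""
    return 1.99e1 - 1.66e-1 * eps


def delta_e_hf(eps: float, c2: float = 2.3127e-3) -> float:
    """Slightly curved HF/6-31G* relation, eq. (7).

    ASSUMPTION: c2 is of order 1e-3; this choice reproduces the HF entries
    of Table 12 and gives a difference close to zero for benzene itself.
    """
    return 5.75e1 - 7.5646e-1 * eps + c2 * eps**2


for eps in (94.0, 111.0, 120.0):
    print(f"eps = {eps:5.1f}  MP2: {delta_e_mp2(eps):6.2f}  HF: {delta_e_hf(eps):6.2f}")
```

Both fits give a difference close to zero at 120°, as expected for undistorted benzene, where the two positions are equivalent, and at the smaller angles the HF curve lies above the MP2 line.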
Apparently, the HF/6-31G* model exaggerates the regioselectivity of the β-site. This is not unexpected, since the Hartree-Fock model overemphasizes the bond fixation in annelated benzenes. It is very important to mention that the orientational property of small rings does not explicitly depend on the ground state angular strain. This can be shown in a very elegant way by employing the concept of homodesmic reactions [91,92]. They enable dissection of the ground state and transition state (TS) effects pertaining to their specific modes of π-electron bond localization:
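[Homodesmic reaction schemes (8) and (9): structural drawings for α- and β-protonation of the deformed benzene, defining the interference energies Eif(ε)α and Eif(ε)β, respectively.]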
The interference energies Eif(ε)α and Eif(ε)β measure the degree of (in)compatibility of the π-bond fixation induced by the angular strain and protonation at the α- and β-positions, respectively. Their values are summarized in Table 12.

Table 12
Decomposition of the difference in energy Eα - Eβ into interference energies describing counteraction and synaction of the two localization patterns occurring upon protonation (in kcal/mol).

Deformation   Substitution   Interference energy                  Eα - Eβ
angle                        HF/6-31G*   MP2(fc)//HF/6-31G*       HF (MP2)
111°          α               1.2         0.9                     2.0 (1.4)
              β              -0.8        -0.5
94°           α               4.4         3.5                     6.8 (4.3)
              β              -2.4        -0.8

It follows straightforwardly that

$$E_\alpha(\varepsilon) - E_\beta(\varepsilon) = E_{if}(\varepsilon)_\alpha - E_{if}(\varepsilon)_\beta$$

implying that regioselectivity depends on the difference in the degree of (mis)matching of the GS and TS modes of the π-bond localization. Therefore, the GS strain energy disappears by subtraction of eqns. (8) and (9), but it is important to realize that the GS structural and electronic features enter the interference energies Eif(ε)α and Eif(ε)β in an implicit way. It appears that the α-positions are deactivated by the angular strain. Their degree of deactivation is an inverse function of the annelation angle ε. The β-position is slightly activated, the activation being practically constant as a function of ε according to the MP2(fc)/6-31G*//HF/6-31G* model. It follows that the enhanced regioselectivity in annelated systems induced by small rings is a consequence of the increased deactivation of the α-position. It is of interest to examine the regioselectivity of benzocyclobutadiene 18. The annelation angle is ε = 88.3° [92] and the electrophilic
reactivity discrimination energy difference Eα - Eβ assumes a value as high as 15 kcal/mol, as obtained by the MP2(fc)/6-31G*//HF/6-31G* model. Employing relation (6) one obtains an estimate of Eα - Eβ = 5.2 kcal/mol, which can be attributed solely to the angular deformation of the four-membered ring. The difference of 9.8 kcal/mol arises from the influence of the external double bond. This is conceptually quite clear, since the incompatibility illustrated by the resonance structure 43(e) (Fig. 22) substantially increases with the additional localized double bond and the antiaromatic interaction across the small ring. The very strong orientational ability of benzocyclobutadiene is again a consequence of the increased deactivation of the α-position. The situation in the true compounds 8, 11 and 12 is similar, albeit more complex because of hyperconjugation [12]. As an illustrative example let us consider benzocyclobutene 11. The corresponding homodesmic reaction for the α-protonation reads:
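[Homodesmic reaction scheme (10): structural drawings for the α-protonation of benzocyclobutene 11, defining the interference energy Eif(4)α.]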
where the number within parentheses denotes the size of the small ring. Eqn. (10) gives the energy of interference originating from two events: (a) fusion of the cyclobutene ring and (b) protonation at the α-site. It is useful to observe that Eif(4)α is given by the negative change in the proton affinity of benzene upon fusion of the small ring, i.e. Eif(4)α = -PAinc(4)α, where the subscript inc stands for the increment defined as PAinc(4)α = PA(4)α - PA(benzene). It is easy to see that eqn. (10) can be extended as follows:
[Equation (11): extended homodesmic reaction scheme involving 2(CH3-CH3) and CH3-CH2-CH2-CH3 as reference species; Eif(4)α is expressed through three terms within square brackets given as structural drawings.]
Terms within square brackets can be interpreted in the following way. The first term can be identified with the strain energy occurring in the transition structure (TS), if the latter is well described by the Wheland σ-complex. Consequently, it will be denoted by Es(TS)4α. This entity involves the angular strain energy of the four-membered ring and the interference energy of the two modes of the π-electron bond fixation as discussed earlier [eqn. (8)]. It is interesting to mention that the influence of the two CH2 groups in benzocyclobutene is approximately cancelled by the two CH3 groups in the protonated o-xylene. The second term represents the conventional GS angular strain energy Es(GS)4 of the parent benzocyclobutene. Finally, the last term is the negative increment of the PA of ortho-xylene, where protonation takes place at the α position. It recovers the influence of the CH2 groups in 11 on the α-site proton affinity in the approximation of two methyl groups in o-xylene. This term is therefore denoted by -PAinc(o-xylene)α. Hence, eqn. (11) can be written in the following form:

$$E_{if}(4)_\alpha = -PA_{inc}(4)_\alpha = E_s(TS)_{4\alpha} - E_s(GS)_4 - PA_{inc}(o\text{-xylene})_\alpha \qquad (12)$$
The same expression mutatis mutandis holds for the β-protonation of benzocyclobutene:

$$E_{if}(4)_\beta = -PA_{inc}(4)_\beta = E_s(TS)_{4\beta} - E_s(GS)_4 - PA_{inc}(o\text{-xylene})_\beta \qquad (13)$$
The difference in the total energy of α- and β-protonated benzocycloalkenes can now be expressed in the following form:

$$E(n)_\alpha - E(n)_\beta = E_s(TS)_{n\alpha} - E_s(TS)_{n\beta} + \left[PA_{inc}(o\text{-xylene})_\beta - PA_{inc}(o\text{-xylene})_\alpha\right] \qquad (14)$$
Here n = 3, 4 and 5 denotes three-, four- and five-membered rings. It should be noticed that the ground state strain energy Es(GS)n has disappeared in the final expression again. Since the activation energy of a particular position within an aromatic fragment can be approximately described as the negative proton affinity, which in turn is defined as a positive entity, it follows that the last term in (14) yields the discrimination of the β- vs. α-site in benzocycloalkenes due to the carbocyclic methylene groups. This is a small and positive contribution to the difference in energy E(n)α - E(n)β. It is given essentially by the difference between the para and ortho PA values of toluene, in view of the additivity of substituent effects in multiply substituted aromatics [93,94]. The difference E(n)α - E(n)β is a linear function of the annelation angle ε:

HF/6-31G*:
$$E(n)_\alpha - E(n)_\beta = 11.15 - 6.98\,\varepsilon_n \quad \text{(in kcal/mol)} \qquad (15)$$

and MP2(fc)//HF/6-31G*:
$$E(n)_\alpha - E(n)_\beta = 7.68 - 4.61\,\varepsilon_n \quad \text{(in kcal/mol)} \qquad (16)$$
with the correlation coefficients R = 0.9996 and 0.994 for eqns. (15) and (16), respectively. They are shown in Fig. 24. It is gratifying that the points corresponding to o-xylene lie off the straight lines as expected, because the MN-effect is absent in this molecule. On the other hand, the relative stability of the β-Wheland intermediates increases with the extent of the MN-effect as the size of the fused ring decreases.

Figure 24. Linear dependence of the difference in total energies between the α- and β-Wheland intermediates on the angle ε of the annelated small ring related to the carbon junction atom (abscissa: annelation angle; curves: HF/6-31G* and MP2(fc)/6-31G*//HF/6-31G*; the o-xylene points are marked separately). Benzocyclopentene, benzocyclobutene and benzocyclopropene are denoted as 8, 11 and 12, respectively.

It is of some interest to analyze the various contributions to the difference E(n)α - E(n)β. They are given in Table 13, where several points deserve attention. The GS angular strain sharply increases along the series 8, 11 and 12. It is interesting to compare the Es(GS)n values of 11 and 12 with the corresponding strain energies of the small rings. By using the same theoretical models one obtains that Es(GS) for cyclobutene is 32.4 (32.9) kcal/mol, where the MP2(fc)/6-31G*//HF/6-31G* estimate is given within parentheses. Fusion to the benzene ring changes the GS strain
Table 13
Resolution of the difference in protonation energy E(n)α - E(n)β into various modes of intramolecular interactions as obtained by the HF/6-31G* and MP2(fc)/6-31G*//HF/6-31G* models a

Molecule   Es(GS)n        Es(TS)nα       Es(TS)nβ       E(n)α - E(n)β
8           2.8 (4.9)      2.6 (5.1)      1.1 (3.2)     3.3 (2.6)
11         31.1 (33.3)    34.3 (37.0)    31.5 (34.4)    4.6 (3.7)
12         73.4 (72.0)    82.0 (82.4)    77.1 (78.2)    6.7 (4.9)

a In kcal/mol. Results offered by the MP2 model are given within parentheses.
energy by -1.3 (3.2) kcal/mol. Protonation at the α- and β-positions leads to an additional increase in the destabilization energy in the TS by 3.2 (3.7) and 0.4 (1.1) kcal/mol, respectively, presumably due to the competition of two opposing modes of π-localization. In 12 these changes are more pronounced. The angular strain of cyclopropene is 58.7 (57.7) kcal/mol by the HF/6-31G* (MP2(fc)/6-31G*//HF/6-31G*) models. The increase in the GS strain energy upon fusion is 14.7 (14.3) kcal/mol. Finally, protonation at the α- and β-sites increases the transition structure strain energy Es(TS) by 8.6 (10.4) and 3.7 (6.2) kcal/mol, respectively. Obviously, the bond localization in benzocyclopropene 12 is more pronounced and the mismatch of the bond localization patterns induced by the MN effect and by protonation is larger. However, Es(TS)nβ is always lower than Es(TS)nα, thus leading to the more stable β-Wheland intermediates. To reiterate, the GS strain energy does not directly determine the regioselectivity in the electrophilic reactions. Nevertheless, its influence cannot be neglected, since it determines the degree of incompatibility of the π-bonding patterns in the α- and β-protonated σ-complexes. In other words, a molecule in its transition structure, modelled by the corresponding intermediate, remembers its ground state geometry and the accompanying π-electron localization ("memory effect"), which is better adapted for β-protonation in Mills-Nixon systems. It should be mentioned that Streitwieser et al. [95] advanced another argument in rationalizing the enhanced acidity at the α-position in benzocyclobutene. They argue that the hybrid AO emanating from the carbon junction atom toward the α site has a high s-character, thus possessing amplified electronegativity. Concomitantly, the α-carbon atom is deprived of some of its electron density, thus being activated in the proton-releasing process. Conversely, the α-site should be deactivated in the electrophilic reactions.
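The protonation increments quoted above follow directly from the entries of Table 13 as Es(TS) - Es(GS). The short Python sketch below merely redoes this arithmetic; the dictionary layout and the function name are illustrative choices and not part of the original work.

```python
# Strain energies from Table 13 in kcal/mol.
# Each tuple holds (HF/6-31G*, MP2(fc)/6-31G*//HF/6-31G*).
TABLE_13 = {
    "8":  {"GS": (2.8, 4.9),   "TS_alpha": (2.6, 5.1),   "TS_beta": (1.1, 3.2)},
    "11": {"GS": (31.1, 33.3), "TS_alpha": (34.3, 37.0), "TS_beta": (31.5, 34.4)},
    "12": {"GS": (73.4, 72.0), "TS_alpha": (82.0, 82.4), "TS_beta": (77.1, 78.2)},
}


def protonation_increments(entry):
    """Return [(d_alpha, d_beta) for HF, (d_alpha, d_beta) for MP2].

    d_alpha and d_beta are the changes in strain energy on going from
    the ground state to the alpha- and beta-protonated structures.
    """
    result = []
    for model in range(2):
        gs = entry["GS"][model]
        d_alpha = round(entry["TS_alpha"][model] - gs, 1)
        d_beta = round(entry["TS_beta"][model] - gs, 1)
        result.append((d_alpha, d_beta))
    return result


increments = {name: protonation_increments(e) for name, e in TABLE_13.items()}
for name, (hf, mp2) in increments.items():
    print(f"{name}: HF (alpha, beta) = {hf}, MP2 (alpha, beta) = {mp2}")
```

For every molecule and both models the β increment is smaller than the α increment, which is the numerical content of the statement that Es(TS)nβ is always lower than Es(TS)nα.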
Apparently, the electron density at the α and β positions in annelated benzenes is of great interest. Therefore, we examined atomic and π-densities in the model system 41. Löwdin effective π-densities obtained by the MP2(fc)/6-31G*//HF/6-31G* model are the same for the ipso, ortho (α) and meta (β) C atoms, being constant for the whole range of changes of the annelation angle ε between 111° and 88°. Total atomic densities obtained by Löwdin population analysis for the carbon junction (ipso) atom change very slowly from 6.17|e| to 6.20|e| as ε decreases from 111° to 88°. On the other hand, the total number of electrons placed on the α- and β-atoms is virtually constant, being 6.16 and 6.17, respectively. The situation is similar in benzocyclobutene 11 and benzocyclopropene 12. Employing the same theoretical procedure as above, one obtains π-densities in 11 that are close to 1.0|e|, being 0.96, 0.99 and 0.99|e| for the ipso, α- and β-carbons, respectively. In benzocyclopropene the
corresponding values read 0.94, 0.99 and 0.98, implying that there is no difference for the critical α- and β-atoms. The formal atomic charge for the α- and β-carbons in 11 is -0.17, implying that there is no difference between them at all. In contrast, α-carbons have a slightly lower density than β-ones in 12 (6.15 vs. 6.17|e|), but this is of little practical significance. Hence, Streitwieser's rehybridization model is of little value in rationalizing the regioselectivity of annelated aromatics observed in electrophilic reactions as far as the atomic charge argument is concerned. Since Mills and Nixon studied hydroxy derivatives of fused aromatic systems like β-hydroxyindan in their original paper [1], we examined a series of model compounds possessing an OH group attached at the β-position [14]. It was found that the hydroxy group exerted an overwhelming influence on the electrophilic susceptibility of the aromatic carbons by substantial activation of its ortho positions. However, the free β′ position is the most active one because of the Mills-Nixon effect, which amplifies the OH inductive effect. Hence the selectivity in the electrophilic substitution reactions is governed once again by the MN effect [14]. It is reasonable to expect that the orientational ability of the heteroanalogs of benzocyclopropene 20 and 21 will be increased in view of the pronounced π-double bond fixation. This is indeed the case, as shown by MP2(fc)/6-31G*//HF/6-31G* calculations [13]. It was found that the difference E(n)α - E(n)β assumes values of 11.8 kcal/mol and 9.6 kcal/mol for 20 and 21, respectively. Hence heteroatoms within the three-membered ring enhance the regioselectivity of the parent benzocyclopropene framework. The fact that the NH group is more effective in this sense than an oxygen atom is traced to a strong lone pair back-bonding interaction occurring in the protonated β-form of 20 [13].
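For readers less familiar with the Löwdin populations used above, the idea can be sketched for the simplest possible case: two normalised basis functions with overlap s and one doubly occupied MO. The populations are the diagonal elements of S^(1/2) P S^(1/2), where S is the overlap matrix and P the density matrix. The two-function model, the coefficients and the overlap value below are purely illustrative assumptions and have nothing to do with the actual 6-31G* calculations.

```python
import math


def sqrt_overlap_2x2(s):
    """Symmetric square root of the overlap matrix [[1, s], [s, 1]], |s| < 1."""
    a = 0.5 * (math.sqrt(1 + s) + math.sqrt(1 - s))
    b = 0.5 * (math.sqrt(1 + s) - math.sqrt(1 - s))
    return [[a, b], [b, a]]


def lowdin_populations(c, s, occupation=2.0):
    """Loewdin populations of one occupied MO with coefficients c = (c1, c2)."""
    # Normalise the MO so that c^T S c = 1.
    norm = c[0] ** 2 + c[1] ** 2 + 2 * s * c[0] * c[1]
    c = [ci / math.sqrt(norm) for ci in c]
    root = sqrt_overlap_2x2(s)
    # Coefficients in the symmetrically orthogonalised basis: S^(1/2) c.
    c_orth = [
        root[0][0] * c[0] + root[0][1] * c[1],
        root[1][0] * c[0] + root[1][1] * c[1],
    ]
    return [occupation * x ** 2 for x in c_orth]


# Hypothetical polarised bond: larger coefficient on the first centre.
pops = lowdin_populations(c=(1.0, 0.6), s=0.25)
print(pops, sum(pops))
```

By construction the populations add up to the number of electrons in the MO, and a symmetric MO (c1 = c2) distributes them equally over the two centres.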
Reversed MN systems exhibit an antipodal π-bond fixation pattern. Consequently, the directing property of small rings fused to benzene in the electrophilic reactions should be diametrically opposite to that firmly established in the MN systems, if the underlying mechanism is the same. In other words, the propensity of the α-position to undergo electrophilic substitution reactions should be increased in compounds like 33 and 34. Calculations performed with the MP2(fc)/6-31G*//HF/6-31G* model show that E(n)α - E(n)β = -8.2 and -21.2 kcal/mol for the protonated forms of 33 and 34, respectively [96]. It turns out that the α-sites are dramatically more susceptible toward electrophilic attack. Their high preference is interpreted by a substantial increase in the aromatic character of the three-membered ring upon α-protonation. However, the ground state distribution of electron density might be important in some reversed MN-systems. Hence, it deserves closer scrutiny. In benzoborirene 33 we found a small difference between the π-populations of the Cα and Cβ atoms, which assume values of 0.95 and 0.87|e|, respectively, as obtained by the MP2(fc)/6-31G* model. However, the total atomic densities were practically equal, being 6.14|e|. Whether the slightly increased π-density of the α-carbons contributes to their increased electrophilic reactivity is an open question. A much more significant difference in the electron population was found in the benzocyclopropenyl cation 34, where the π-densities of the Cα and Cβ atoms were 0.93 and 0.84|e|, respectively. Concomitantly, the total atomic density of the α-carbon is higher than that of Cβ (1.1|e| vs. 1.07|e|). We are inclined to believe that redistribution of the GS electron charge density has some influence on the electrophilic reactivity in this particular system, but it is definitely not the dominating
factor. We note also that 7,7-difluorobenzocyclopropene exhibits a mild anti-MN mode of π-bond localization. Consequently, the α-protonated species should be more stable than the β-protonated form. Calculations carried out with the MP2/6-31+G** model give Eα - Eβ = -1.1 kcal/mol, which is a small but conceptually important shift of the total electronic energy in the right direction [97]. An important outcome of these calculations is conclusive evidence that the orientational ability of small annelated rings is governed by the GS memory effect, dictated by the specific π-bond fixation and its interference with the π-localization pattern triggered by protonation, which in turn simulates the transition structure. There is rich experimental evidence showing that aryl positions adjacent to strained annelated rings exhibit reduced reactivity towards electrophilic reagents [2-7,98-101]. These data are in harmony with the theoretical interpretation in terms of the bond fixation model presented above. It is possible that the ground state charge distribution in some particular fused molecular systems affects the propensities of the α and β atoms to undergo electrophilic substitution reactions, but an interpretation based solely on the rehybridization effect at the carbon junction atoms, as offered by Siegel et al. [9], is obviously unjustified.
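The sign convention behind these comparisons is easy to lose track of: a positive E(n)α - E(n)β means that the α-protonated intermediate lies higher in energy, so that the β-site is preferred, and a negative value means the opposite. The following Python sketch simply collects the correlated (MP2-level) differences quoted in this section and applies that rule. The dictionary keys and the function name are illustrative labels only.

```python
# E(n)_alpha - E(n)_beta in kcal/mol, as quoted in the text.
# Values for 8, 11 and 12 are the MP2 entries of Table 13; the value for
# 7,7-difluorobenzocyclopropene was obtained with the MP2/6-31+G** model.
ENERGY_DIFFERENCES = {
    "8": 2.6,
    "11": 3.7,
    "12": 4.9,
    "20": 11.8,
    "21": 9.6,
    "33": -8.2,
    "34": -21.2,
    "7,7-difluorobenzocyclopropene": -1.1,
}


def preferred_site(delta):
    """Return the site whose protonated intermediate is lower in energy."""
    return "beta" if delta > 0 else "alpha"


for compound, delta in ENERGY_DIFFERENCES.items():
    print(f"{compound:32s} {delta:6.1f}  ->  {preferred_site(delta)}")
```

The MN systems 8, 11, 12, 20 and 21 all come out as β-directing, whereas the reversed MN systems 33 and 34 and the difluoro derivative come out as α-directing.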
5.2. Miscellaneous Physical and Chemical Properties

The theoretical model developed in order to rationalize the Mills-Nixon effect, based on rehybridization and concomitant partial π-bond localization, which is sometimes additionally enhanced by delocalization or hyperconjugation encompassing the annelated ring, is relevant and useful only if it can predict new chemical and physical features. An interesting test of the MN effect is provided by the relative stability of tautomers of fused systems which differ in the double-bond character of the annelated bond. The latter changes abruptly in going from one tautomer to the other in a series of pyrazoles depicted in Fig. 25. A careful study of this type has been performed by Elguero and coworkers [103-105]. If the MN picture is correct, then the tautomer whose annelated bond possesses the higher single-bond character should be more stable. 13C NMR measurements have shown that isomer 45b is more abundant than its 45a counterpart, whereas the opposite is the case for the pair 46a and 46b. This is consistent with the MN pattern of π-electron localization within the aromatic (pyrazole) moiety. Tautomers 44 were not amenable to experiment, but theoretical calculations predict that the 45b tautomer is more stable by 2.6 kcal/mol. Complementary semiempirical and HF/6-31G* computational studies of a number of pyrazole tautomers, involving also bicyclic annelated fragments, show that the picture offered by MN localization is persistently correct [104]. The same holds for model systems provided by pyrazole with angularly deformed C-H bonds, which simulate the annelation event. A problem related to the MN-effect is the enol/enol tautomerism of some β-dicarbonyl fragments emanating from small cyclic and bicyclic rings. Semiempirical AM1 calculations show that tautomers possessing an "annelated" bond with low double-bond character are energetically more favourable [105].
These results should be taken with due caution in view of the highly approximate nature of the AM1 method. We note in passing that the conjectures presented above are in accord with Brown's postulate that exo double bonds stabilize a five-membered ring whereas they destabilize a six-membered ring [106]. The generalization of this proposition indicates that reactions will proceed in such a manner
[Figure 25: structural formulas of the pyrazole tautomer pairs 44a/44b, 45a/45b and 46a/46b.]
Figure 25. Some pyrazole tautomers which illustrate the MN-effect.
as to favor the formation or retention of an exo double bond in a five-membered ring. Brown et al. [106] found that this hypothesis accounted for the available data with very few exceptions. Apparently, the MN effect can be put in the wider context of the chemical reactivity of organic cyclic compounds. There are a number of papers discussing the distribution of the spin density in annelated aromatic compounds. They are related to the MN effect in so far as the results are usually interpreted by Streitwieser's rehybridization-polarization model, where some of the σ-density of the Cα atoms is shifted toward the ipso carbon atoms [95]. Consequently, the atomic electronegativity of the Cα and Ci atoms is increased and decreased, respectively, thus stabilizing π-MOs with large/small coefficients at the Cα/Ci atoms. This simple picture proved useful in rationalizing the ESR spectra of both anions and cations of MN-systems [107-113]. The same simple model was utilized in interpreting the ring strain effect on the half-wave reduction and oxidation potentials upon fusion of a small ring with an aromatic molecule
[114,115]. It should be kept in mind, however, that rehybridization is an important ingredient of the MN effect, but the phenomenon is usually much more complex than that. Nevertheless, it is of great importance to have reliable information on the redistribution of s-characters, since this faithfully reflects the changes taking place in the σ-framework of MN- or reversed MN-systems. A useful probe of the s-character is provided by the J(C-H) and J(C-C) indirect spin-spin coupling constants of directly bonded atoms. An experimental study of the J(C-C) coupling constants in benzocycloalkenes reveals their increase in the ortho bonds as the size of the carbocyclic ring decreases [116]. Measurements of the J(CC) coupling constants of cyclobutabenzene and its M(CO)3 complexes (M = Cr, Mo and W) point to a weak but not negligible alternation of the carbon s-characters in the benzene ring CC bonds, which is consistent with a slight MN bond fixation [117]. Finally, it should be noted that the basicity of the nitrogen at the α-position decreases in the series of annelated pyridines as the angular strain of the fused ring increases [118]. This is in harmony with the interpretation of the MN effect presented in Section 5.1. More work on this topic is highly desirable.
6. Concluding Remarks

The compelling evidence presented here shows that the Mills-Nixon effect is not an elusive theoretical construct, but a hard fact which is rooted in a number of chemical and physical properties of annelated compounds. The underlying mechanism rests on rehybridization at the junction atoms accompanied by the π-electron delocalization, triggered by changes in the σ-framework and by resonance (or hyperconjugation) interaction(s) involving the annelated ring(s). A strong MN-effect is observed in molecules where the σ- and π-contributions act in a synergistic manner. Selectivity and control in electrophilic reactions, including protonation, are determined by an interplay of two π-electron localization patterns: one induced by the MN effect in the ground state (molecular memory effect) and the other occurring in the transition structure modelled by the corresponding Wheland σ-complex. The degree of mismatch between these two competing localization modes significantly disfavors the α-positions adjacent to a (small) ring, thus leading to a remarkable regioselectivity. An interesting complementary family of fused systems is provided by compounds exhibiting reversed Mills-Nixon π-bond fixation. In these systems π-electron delocalization acting in opposition to the σ-rehybridization effect is the dominant factor. The electrophilic reactivity pattern in these systems is diametrically opposite to that established in MN-compounds. These conclusions are the results of our extensive ab initio calculations on the MN- and reversed MN-systems over a number of years. They are corroborated by other theoretical contributions to the field [119-122]. There is also ample experimental evidence (vide supra) which supports the ideas outlined above. Recently, the X-ray structure of triphenyltruxenene, a C48 polycyclic buckybowl precursor, was determined [123].
A small but significant MN bond alternation in the central benzene ring was found, as evidenced by the ipso and ortho bond distances of 1.418(6) Å and 1.387(6) Å, respectively. Although we discussed almost exclusively distortions of the benzene ring upon annelation in this review article, we are confident that the results obtained and the conclusions drawn have a general significance. Hence, it is reasonable to assume that similar deformations are to be
expected in larger fused aromatics. As an illustration we just offer the X-ray structure of naphtho[b]cyclopropene [124]. Comparing these data with the X-ray geometry of the parent naphthalene [41], one can draw the unambiguous conclusion that fusion of the small three-membered ring induces additional π-bond localization in the MN sense. This is in accordance with our theoretical results [42]. All these data fit the mosaic called the Mills-Nixon effect as described above. It is, therefore, surprising to find in the literature papers like: "X-Ray Diffraction Evidence for a Cyclohexatriene Motif in the Molecular Structure of Tris(bicyclo[2.1.1]hexeno)benzene: Bond Alternation after the Refutation of the Mills-Nixon Theory" [55]. In this communication the authors fiercely combat the "theory" of Mills and Nixon, which is nothing more than the Kekulé oscillation model of benzene. Since the studied system exhibits a strong alternation within the central benzene moiety, Bürgi et al. [55] introduce a "new bicyclic strain" effect. This is hardly justified, because the annelated bicyclic fragments represent just a slight generalization of the original MN-systems. In another contribution Frank and Siegel [9] state that: ... "The Mills-Nixon hypothesis has no meaning. Nonetheless, the legacy of this hypothesis doggedly persists in our research discussions and should be laid to rest." Both articles imply that other workers in the field base their research upon the Kekulé-Mills-Nixon oscillating model of annelated benzenes. In keeping with Pauling's life-long quest for truth in science (as well as in politics!), we would like to point out that these allegations are out of place. We examined all relevant papers dealing with the MN-effect and, not surprisingly, none of them involved even a word about the possible oscillations of benzene. Instead, the largest body of the examined papers discussed physical and chemical consequences of the Mills-Nixon effect.
Let it be so in the future, because this chapter is not closed as yet!
Acknowledgement. We thank Dr. Howard Maskill for useful discussions.
REFERENCES

1. W.H. Mills and I.G. Nixon, J. Chem. Soc., (1930) 2510.
2. J.B.F. Lloyd and P.A. Ongley, Tetrahedron, 20 (1964) 2185.
3. J. Vaughan, G.J. Welch and G.J. Wright, Tetrahedron, 21 (1965) 1665.
4. A.R. Bassindale, C. Eaborn and D.R.M. Walton, J. Chem. Soc. B, (1969) 12.
5. J.M. Blatchly and R. Taylor, J. Chem. Soc. B, (1964) 4641; R. Taylor, J. Chem. Soc. B, (1971) 536; J.M. Blatchly and R. Taylor, J. Chem. Soc. B, (1968) 1402.
6. J.L.G. Nilsson, H. Selander, H. Sievertsson and I. Skanberg, Acta Chem. Scand., 24 (1970) 580; J.L.G. Nilsson, H. Selander, H. Sievertsson, I. Skanberg and K.-G. Svensson, Acta Chem. Scand., 25 (1971) 94.
7. J. Novrocik, J. Poskočil and I. Čepčiansky, Coll. Czech. Chem. Commun., 43 (1978) 1488.
8. Z.B. Maksić, M. Eckert-Maksić, M. Hodošček, W. Koch and D. Kovaček, in Molecules in Natural Science and Medicine. An Encomium for Linus Pauling, Ellis Horwood, Chichester, 1991, p. 333.
9. N.L. Frank and J.S. Siegel, Adv. Theor. Interesting Mols., 3 (1995) 209.
10. R. Taylor, Electrophilic Aromatic Substitution, Ellis Horwood, Chichester, 1990.
11. M. Eckert-Maksić, Z.B. Maksić and M. Klessinger, Int. J. Quant. Chem., 49 (1994) 383.
12. M. Eckert-Maksić, Z.B. Maksić and M. Klessinger, J. Chem. Soc. Perkin Trans. 2, (1994) 285.
13. Z.B. Maksić, D. Kovaček and B. Kovačević, El. J. Theor. Chem., 1 (1996) 65.
14. M. Eckert-Maksić, M. Klessinger, D. Kovaček and Z.B. Maksić, J. Phys. Org. Chem., 9 (1996) 269.
15. L.E. Sutton and L. Pauling, Trans. Farad. Soc., 31 (1935) 939.
16. J.S. Siegel, Angew. Chem. Int. Ed. Engl., 33 (1994) 1721.
17. R. Boese, D. Bläser, W.E. Billups, M.M. Haley, A.M. Maulitz, D.L. Mohler and K.P.C. Vollhardt, Angew. Chem. Int. Ed. Engl., 33 (1994) 313.
18. K.K. Baldridge and J.S. Siegel, J. Am. Chem. Soc., 114 (1992) 9583.
19. A. Stanger and K.P.C. Vollhardt, J. Am. Chem. Soc., 53 (1988) 4889.
20. A. Stanger, J. Am. Chem. Soc., 113 (1991) 8277.
21. O. Mó, M. Yáñez, M. Eckert-Maksić and Z.B. Maksić, J. Org. Chem., 60 (1995) 1638.
22. R.H. Mitchell, P.D. Slowey, T. Kamada, R.V. Williams and P.J. Garratt, J. Am. Chem. Soc., 106 (1984) 2431.
23. M.J. Collins, J.E. Gready, S. Sternhell and C.W. Tansey, Austr. J. Chem., 43 (1990) 1547.
24. M. Barfield, M.J. Collins, J.E. Gready, P.M. Hatton, S. Sternhell and C.W. Tansey, Pure & Appl. Chem., 62 (1990) 463.
25. K. Hedberg, L. Hedberg, D.S. Bethune, C.A. Brown, H.C. Dorn, R.D. Johnson and M. de Vries, Science, 254 (1991) 410.
26. W.I.F. David, R.M. Ibberson, J.C. Matthewman, K. Prassides, T.J.S. Dennis, J.P. Hare, H.W. Kroto, R. Taylor and D.R.M. Walton, Nature, 353 (1991) 147.
27. J.M. Hawkins, A. Meyer, T.A. Lewis, S. Loren and F.J. Hollander, Science, 252 (1991) 312.
28. Th. Förster, Z. Phys. Chem., B 43 (1939) 58.
29. C.A. Coulson and W.E. Moffitt, Phil. Mag., 40 (1949) 1.
30. M. Randić and Z.B. Maksić, Theor. Chim. Acta, 3 (1965) 59.
31. For a review see: Z.B. Maksić, in Theoretical Models of Chemical Bonding, Part 2, Z.B. Maksić (ed.), Springer Verlag, Berlin-Heidelberg, 1990, p. 137.
32. Z.B. Maksić, M. Eckert-Maksić, D. Kovaček and D. Margetić, J. Mol. Struct. (Theochem), 260 (1992) 241.
33. J.P. Foster and F. Weinhold, J. Am. Chem. Soc., 102 (1980) 7211; A.E. Reed and F. Weinhold, J. Chem. Phys., 78 (1983) 4066.
34. A.E. Reed and F. Weinhold, J. Chem. Phys., 78 (1983) 1736; A.E. Reed, L.A. Curtiss and F. Weinhold, Chem. Rev., 88 (1988) 899.
35. For a review of the existing methods of apportioning the electron density in molecules see: K. Jug and Z.B. Maksić, in Theoretical Models of Chemical Bonding, Vol. 3, Z.B. Maksić (ed.), Springer Verlag, Berlin-Heidelberg, 1991, p. 235.
36. A.D. Walsh, Disc. Farad. Soc., 2 (1997) 18.
37. H.A. Bent, Chem. Rev., 61 (1961) 275.
38. D.L. Mohler, K.P.C. Vollhardt and S. Wolff, Angew. Chem., Int. Ed. Engl., 29 (1990) 1151.
39. D. Kovaček, D. Margetić and Z.B. Maksić, J. Mol. Struct. (Theochem), 285 (1993) 195.
40. G. Filippini, J. Mol. Struct., 130 (1985) 117 and references cited therein.
41. C.P. Brock and J.D. Dunitz, Acta Cryst. B, 38 (1982) 2278.
42. M. Hodošček, D. Kovaček and Z.B. Maksić, Theoret. Chim. Acta, 86 (1993) 343.
43. M. Eckert-Maksić, Z. Glasovac, B. Kovačević and Z.B. Maksić, to be published.
44. The energy required for partial π-electron localization seems to be small. In fact, it is possible that π-electrons alone prefer to be localized. See e.g. S.S. Shaik and P.C. Hiberty, J. Am. Chem. Soc., 107 (1985) 3089; S.S. Shaik, P.C. Hiberty, G. Ohanessian and J.-M. Lefour, J. Chem. Phys., 92 (1988) 5086.
45. F.H. Allen, Acta Cryst. B, 37 (1981) 900.
46. M. Eckert-Maksić, Z.B. Maksić, M. Hodošček and K. Poljanec, J. Mol. Struct. (Theochem), (1993) 187.
47. E.R. Boyko and P.A. Vaughan, Acta Cryst., 17 (1964) 152.
48. M. Eckert-Maksić, D. Kovaček, M. Hodošček, D. Mitić, K. Poljanec and Z.B. Maksić, J. Mol. Struct. (Theochem), 206 (1990) 89.
49. R. Boese and D. Bläser, Angew. Chem., Int. Ed. Engl., 27 (1988) 304.
50. M. Eckert-Maksić, Z.B. Maksić, M. Hodošček and K. Poljanec, Int. J. Quant. Chem., 42 (1992) 869.
51. R. Neidlein, D. Christen, V. Poignee, R. Boese, A. Gieren, C. Ruiz-Perez and T. Hübner, Angew. Chem. Int. Ed. Engl., 27 (1988) 295.
52. Z.B. Maksić and M. Eckert-Maksić, Croat. Chem. Acta, 42 (1970) 433.
53. M. Eckert-Maksić and Z.B. Maksić, J. Mol. Struct. (Theochem), 86 (1982) 325.
54. G. Runtz, R.F.W. Bader and R.R. Messer, Can. J. Chem., 55 (1977) 3040.
55. H.-B. Bürgi, K.K. Baldridge, K. Hardcastle, N.L. Frank, P. Gantzel, J.S. Siegel and J. Ziller, Angew. Chem. Int. Ed. Engl., 34 (1995) 1454.
56. M.J.S. Dewar and H.N. Schmeising, Tetrahedron, 11 (1960) 96.
57. C.C. Costain and B.P. Stoicheff, J. Chem. Phys., 30 (1959) 777.
58. K. Kovačević and Z.B. Maksić, J. Org. Chem., 39 (1974) 539.
59. Z.B. Maksić and A. Rubčić, J. Am. Chem. Soc., 99 (1977) 4233.
60. Y. Apeloig and D. Arad, J. Am. Chem. Soc., 108 (1986) 3241.
61. M. Eckert-Maksić, Z. Glasovac and Z.B. Maksić, J. Organomet. Chem., in print.
62. W.M. Stigliani, V.W. Laurie and J.C. Li, J. Chem. Phys., 62 (1975) 1890.
63. D. Kovaček, Z.B. Maksić and I. Novak, J. Phys. Chem. A, 101 (1997) 1147.
64. R. Boese, D. Bläser, K. Gomann and U.H. Brinker, J. Am. Chem. Soc., 111 (1989) 1501.
65. M. Eckert-Maksić, M. Hodošček, N. Novak-Doumbuya and Z.B. Maksić, in preparation.
66. W. Winter and T. Butters, Acta Cryst. B, 37 (1981) 1524.
67. W. Koch, M. Eckert-Maksić and Z.B. Maksić, Int. J. Quant. Chem., 48 (1993) 319.
68. M.V. Capparelli, R. Machado, Y. De Sanctis and A.J. Arce, Acta Cryst. C, 52 (1996) 947.
69. L. Moore, R. Lubinski, M.C. Baschky, G.D. Dakhle, M. Hare, T. Arrowood, Z. Glasovac, M. Eckert-Maksić and S.R. Kass, J. Org. Chem., 62 (1997) 7390.
70. M. Eckert-Maksić, A. Lesar and Z.B. Maksić, J. Chem. Soc., Perkin Trans. 2, (1992) 993.
71. M. Hodošček, D. Kovaček and Z.B. Maksić, J. Mol. Struct. (Theochem), 281 (1993) 213.
72. N.L. Frank, K.K. Baldridge, P. Gantzel and J.S. Siegel, Tetr. Lett., 36 (1995) 4389.
73. N.L. Frank, K.K. Baldridge and J.S. Siegel, J. Am. Chem. Soc., 117 (1995) 2102.
74. F. Cardullo, D. Giuffrida, F.H. Kohnke, F.M. Raymo, J.F. Stoddart and D.J. Williams, Angew. Chem. Int. Ed. Engl., 35 (1996) 339.
75. K.P.C. Vollhardt, Pure Appl. Chem., 65 (1993) 153 and references cited therein.
76. B.C. Berris, G.H. Hovakeemian, Y.H. Lai, H. Mestagh and K.P.C. Vollhardt, J. Am. Chem. Soc., 107 (1985) 5670.
77. M. Eckert-Maksić, M. Hodošček, D. Kovaček, Z.B. Maksić and K. Poljanec, Chem. Phys. Lett., 171 (1990) 49.
78. Z.B. Maksić, D. Kovaček, M. Eckert-Maksić, M. Böckmann and M. Klessinger, J. Phys. Chem., 99 (1995) 6410.
79. Z.B. Maksić and D. Kovaček, in preparation.
80. R. Diercks and K.P.C. Vollhardt, J. Am. Chem. Soc., 108 (1986) 3150.
81. D. Kovaček, D. Margetić and Z.B. Maksić, J. Mol. Struct. (Theochem), 285 (1993) 195.
82. Z.B. Maksić, M. Eckert-Maksić and K.-H. Pfeifer, J. Mol. Struct., 300 (1993) 445.
83. W. Koch, M. Eckert-Maksić and Z.B. Maksić, J. Chem. Soc. Perkin 2, (1993) 2195.
84. M. Eckert-Maksić, Z. Glasovac, M. Hodošček, A. Lesar and Z.B. Maksić, J. Organomet. Chem., 524 (1996) 107.
85. E.A.C. Lucken, J. Chem. Soc., (1959) 2954; J.F.A. Williams, Tetrahedron, 18 (1962) 1477; A.E. Reed and P.v.R. Schleyer, J. Am. Chem. Soc., 115 (1993) 614.
86. For alternative view see: K.B. Wiberg and P.R. Rablen, J. Am. Chem. Soc., 115 (1993) 614.
87. A. Stanger, N. Ashkenazi, R. Boese and P. Stellberg, J. Organomet. Chem., 542 (1997) 19.
88. T.J. Lynch, M.C. Helvenston, A.L. Rheingold and D.L. Staley, Organomet., 8 (1989) 1959.
89. G.W. Wheland, J. Am. Chem. Soc., 64 (1942) 900.
90. P. George, M. Trachtmann, C.W. Bock and A.M. Brett, J. Chem. Soc. Perkin Trans. 2, (1976) 317.
91. P. George, M. Trachtmann, A.M. Brett and C.W. Bock, J. Chem. Soc. Perkin Trans. 2, (1997) 1036.
92. M. Eckert-Maksić, W.M.F. Fabian, R. Janoschek and Z.B. Maksić, J. Mol. Struct. (Theochem), 338 (1995) 1.
93. M. Eckert-Maksić, M. Klessinger and Z.B. Maksić, Chem. Eur. J., 2 (1996) 1251.
94. Z.B. Maksić, M. Eckert-Maksić and M. Klessinger, Chem. Phys. Lett., 260 (1996) 572.
95. A. Streitwieser, Jr., G.R. Ziegler, P.C. Mowery, A. Lewis and R.G. Lawler, J. Am. Chem. Soc., 90 (1968) 1357.
96. M. Eckert-Maksić, Z. Glasovac, Z.B. Maksić and I. Zrinski, J. Mol. Struct. (Theochem), 366 (1996) 173.
97. O. Mó, M. Yáñez, M. Eckert-Maksić and Z.B. Maksić, to be published.
98. R. Taylor, G.J. Wright and A.J. Homes, J. Chem. Soc. B, (1967) 780.
99. J.H.P. Utley and T.A. Vaughan, J. Chem. Soc. Perkin 2, (1972) 2343.
100. P.J. Garratt and D.N. Nicolaides, J. Org. Chem., 39 (1974) 2222.
101. B. Sket and M. Zupan, J. Org. Chem., 43 (1978) 835; B. Zajc and M. Zupan, Tetrahedron, 45 (1989) 7869.
102. H. Tanida and R. Muneyuki, Tetrahedron Lett., (1964) 2787.
103. A. Martinez, M.L. Jimeno, J. Elguero and A. Fruchier, New J. Chem., 18 (1994) 269.
104. I. Alkorta and J. Elguero, Struct. Chem., 8 (1997) 189.
105. M. Ramos, I. Alkorta and J. Elguero, Tetrahedron, 53 (1997) 1403.
106. H.C. Brown, J.H. Brewster and H. Schechter, J. Am. Chem. Soc., 76 (1954) 467.
107. R.D. Rieke, C.F. Meares and L.I. Rieke, Tetrahedron Lett., (1968) 5275.
108. R.D. Rieke and W.E. Rich, J. Am. Chem. Soc., 92 (1970) 7349.
109. R.D. Rieke, S.E. Bales, P.M. Hundall and C.F. Meares, J. Am. Chem. Soc., 93 (1971) 697.
110. R.D. Rieke, S.E. Bales, C.F. Meares, L.I. Rieke and C.M. Milliren, J. Org. Chem., 39 (1974) 2276.
111. A.G. Davies and K.M. Ng, J. Chem. Soc. Perkin Trans. 2, (1992) 1857.
112. D.V. Avila, A.G. Davies, E.R. Li and K.M. Ng, J. Chem. Soc. Perkin Trans. 2, (1993) 355.
113. A.G. Davies, G. Gescheidt, K.M. Ng and M.K. Shepherd, J. Chem. Soc. Perkin Trans. 2, (1994) 2423.
114. R.D. Rieke, W.E. Rich and T.H. Ridgway, Tetrahedron Lett., (1969) 4381.
115. R.D. Rieke, W.E. Rich and T.H. Ridgway, J. Am. Chem. Soc., 93 (1971) 1962.
116. H. Günther and W. Herrig, J. Am. Chem. Soc., 97 (1975) 5594.
117. H. Butenschön, B. Gabor, R. Mynott and H.G. Wey, Z. Naturforsch., 50b (1995) 483.
118. See ref. [9] and papers cited therein.
119. P.C. Hiberty, G. Ohanessian and F. Delbecq, J. Am. Chem. Soc., 107 (1985) 3095.
120. R. Benassi, S. Ianelli, M. Nardelli and F. Taddei, J. Chem. Soc. Perkin Trans. 2, (1991) 1381.
121. R. Faust, E.D. Glendening, A. Streitwieser and K.P.C. Vollhardt, J. Am. Chem. Soc., 114 (1992) 8263.
122. E. Lewars, J. Mol. Struct. (Theochem), 360 (1996) 67.
123. M.J. Plater, M. Praveen and A.R. Howie, J. Chem. Res. (S), (1997) 46.
124. W.E. Billups, W.Y. Chow, K.H. Leavell, E.S. Lewis, J.L. Margrave, R.L. Sass, J.J. Shieh, P.G. Werness and J.L. Wood, J. Am. Chem. Soc., 95 (1973) 7878.
Z.B. Maksić and W.J. Orville-Thomas (Editors)
Pauling's Legacy: Modern Modelling of the Chemical Bond
Theoretical and Computational Chemistry, Vol. 6
© 1999 Elsevier Science B.V. All rights reserved.
Predicting structures of compounds in the solid state by the global optimisation approach

J. C. Schön, M. Jansen

Institut für Anorganische Chemie, Universität Bonn, Gerhard-Domagk-Str. 1, D-53121 Bonn, Germany

1. INTRODUCTION

The ability to predict the existence of hypothetical compounds under given thermodynamic boundary conditions, and also to develop realistic routes for their synthesis, is a major step towards one of the main goals of chemistry, the detailed planning of the synthesis of new compounds. In this chapter, we will concentrate on the first aspect of the planning of syntheses, the determination of hypothetical compounds that are capable of existence. Already with respect to the predictability of (meta)stable compounds, there exist dramatic differences among the different branches of preparative chemistry. In the molecular chemistry of carbon and other nonmetals, an impressive ability in the design, i.e., the prediction, of such molecules has been reached [1, 2]. In preparative solid state chemistry, however, it is possible neither to predict whether some hypothetical compound can exist at all, nor what structure(s) it would exhibit. In most cases, the research is experimental in the classical sense of the word. Of course, the success of such an exploration can be improved through clever choices of educts, and appropriate temperature and pressure programs during the synthesis. But a control of the reactions involved that would be sufficient for an a-priori design of a specific synthesis route is not available at this time (a1). At first sight, this discrepancy is puzzling. After all, molecules and solids obey the same laws of quantum mechanics and statistical mechanics that tell us that the existence and stability of a compound are derived from the properties of the energy function of the chemical system.
But it appears nearly impossible to predict even the existence of stable solid phases with a given fixed composition, not to mention all the metastable phases that might exist [3]. Why should Li3N exist but not Na3N or Cs3N? Why is TiO2 found in a large number of stable modifications, e.g. rutile, anatase and brookite, while for MgF2 only one structure has been reliably reported? Indubitably, it is often possible to estimate whether a suggested structure of a previously unknown compound is reasonable or strange; and there exist a number of empirical rules, e.g. the radius-ratio rule [4-6], the valence electron concentration rule [7, 8], Pauling's bond-valence rules [6, 9] and the analysis of partial Madelung constants [10], that help us to pronounce judgement after the fact. But even with the help of these rules and some empirical information about (formal) ionic charges and ionic radii, or some guidelines about the typical number of bonds a certain atom usually tries to establish, it is very difficult even to derive three-dimensional structures for a hypothetical compound of a given composition (a2). And we are not even addressing the question of which structure might be the thermodynamically stable one (if any exists at all), or whether any of the competing ones might be kinetically stable. Not knowing where to go, planning a synthesis route is even more fraught with perils, of course.

The reason for the failure of common sense to decide on the existence of a solid compound and to predict its structure can be found in the difficulty of writing down a simple energy function that is dominated by a restricted set of local interactions, as is the case, e.g., for an organic molecule. There, the local environment of an atom within a suggested molecule is dominated by the bonds to the neighbouring atoms and to a lesser degree by spatial packing requirements. This reflects the fact that the energy function for those kinetically stable molecules the molecular chemist usually encounters can be described by a sum of local (short-range) interactions, leading to a hierarchical structure in the construction of a molecule (a3). Of course, for a given existing structure of a solid, one can always construct a-posteriori some purely short-range potential containing empirical parameters that are fitted to, e.g., certain elastic constants and cell parameters. Because of the way it was constructed, such a potential can be used to justify, but not to predict, the local coordination polyhedra one observes. However, since such a very specific potential cannot be easily transferred even to another polymorphic modification of a given compound, it usually lacks the versatility inherent in the successful potentials used to describe the interactions within molecules.

(a1) The reason for this difference regarding the control of the synthesis itself becomes clear upon considering the different (thermodynamic) conditions and processes that pertain to the molecular and solid state reactions, respectively. The regiospecificity of reactions in molecules and the mainly kinetic control of the reactions together constitute an efficient toolbox for the planning of molecular syntheses, while similar tools are usually not available for solid state syntheses.
Most "general" forms of effective energy functions need to contain terms that describe long-range interactions if they are to be of any use in comparing many different hypothetical structures of an unknown compound. However, once such terms are introduced, the simple building blocks one needs to construct consistent structures, in analogy to the visualisation of a molecule, become increasingly fuzzy. In addition, we are dealing with practically infinite three-dimensional solids; thus the number of possible arrangements of simple building blocks like coordination polyhedra into periodic structures appears to be limitless, not to mention the possibility of amorphous structures and quasicrystals. Apart from heuristic arguments like "each atom of a given type should have the same environment" or "the structure with the higher symmetry should be more prevalent", there appears to be no obvious way to quickly choose among configurations that exhibit the same local structure but are different otherwise.

While it is still impossible to safely predict new and interesting solids by intuition or simple rules alone, with the advent of powerful computers numerical investigations of the hypersurface of the potential energy (energy landscape) of the chemical system as a function of the atomic/ionic coordinates have become feasible, as will be described in this chapter. Although the main goal of such a detailed study of the energy hypersurface is the recognition of promising synthetic targets and the evaluation of their kinetic and thermodynamic stability, the neighbourhood structure of e.g. the set of local minima may even yield hints for possible synthetic routes.

[a2] Some approaches [11-15] in this direction employ a-priori knowledge about bonding preferences and/or spatial constraints, often assuming a known unit cell, together with bond-valence rules to suggest possible structures.

[a3] Since "allowed" local configurations are well known from experience, both with respect to their geometry and their stability, any presumably stable molecule can be easily visualised and constructed using these local configurations as a system of stable structural increments. On the next level, one can investigate the secondary (topological) structure consisting of rings, clusters, etc., which again must follow certain simple rules based on size and bonding angle requirements.
2. THE ENERGY LANDSCAPE

In principle, the prediction of metastable compounds of a chemical system should consist of two steps: First we solve the Schrödinger equation of the system, and then we analyse the statistical mechanical properties and the stability of these solutions. However, even the first step is not feasible in the case of a solid, for in general the eigenfunctions of a solid will be complicated functions of both the ionic and the electronic degrees of freedom. Every "classical ionic configuration" and its "excited states" will be a linear combination of many such eigenfunctions. Because of the large number of ions and electrons involved, determining these functions and their eigenvalues is not yet possible. Thus, one tries to reduce the problem by assuming that a) the electronic and ionic degrees of freedom can approximately be separated (Born-Oppenheimer approximation [16]), and b) the ionic part of the wave function will, in addition, be approximately separable into N well-defined, nearly classical, localized particles plus a (negligible) set of zero-point vibrations [a4]. If this is possible, we can construct an ("electronic ground state") energy hypersurface as a parametric function of classical ionic configurations alone, which is the intuitively relevant quantity, since one usually associates a metastable compound with some specific atomic/ionic configuration. Note that the energy is a function on the 3N-dimensional configuration space, which must not be confused with the thermodynamic space.

From a conceptual point of view, one observes that the classical picture of the chemical bond is in the threefold Hegelian sense "aufgehoben" within the concept of the energy landscape, i.e., the energy landscape at the same time removes, preserves, and raises to new heights of sophistication our understanding of the chemical bond.
As a consequence, simplifications and approximations involved in the actual calculation of the energy, as discussed in section 4.1, often lead us back to the older concepts of distinct types of bonds, e.g. ionic or covalent ones.

[a4] Clearly, there are some obvious problems: there are many systems where electron-phonon interactions play an important role in stabilizing or destabilizing an otherwise unstable or stable structure. Thus, both the Born-Oppenheimer approximation and the assumption of negligible zero-point vibrations can fail [17-19]. Nevertheless, these issues are somewhat beside the point: While structural details (distortions away from a high symmetry, precise dimensions of the unit cell), electronic properties (metal, insulator) and the degree of stability of the compounds may change upon analyzing a given structure in greater detail, we are concerned in the first place with the existence and the general structure type of a hypothetical compound. Unless there are reasons to believe that the neglected quantum mechanical effects will destroy or create important configurations outright, the construction of the (potential) energy hypersurface is a good first step.

The next step in the prediction of metastable compounds consists in the exploration of this energy landscape. For T = 0 K, it is very clear what constitutes a (meta)stable structure: It is the one, and only one, configuration that is associated with a local minimum of the energy hypersurface. Thus, for T = 0 K our task is to determine all local minima of the energy landscape. In contrast, for T > 0 K, equilibrium statistical mechanics tells us that each configuration "i" with energy E_i has a non-zero probability

$$p(i) = \frac{\exp(-E_i/k_B T)}{Z}, \qquad Z = \sum_j \exp(-E_j/k_B T) \qquad (1)$$
of being present at a given time. This implies that the concept of a single unique configuration that represents the system at T > 0 K at all times no longer applies. Now, a metastable compound corresponds to some (as yet unidentified) region R of the energy landscape, and the structure of "R" is given by the time-average over all the configurations within R. Thus, the equilibrium probability that the system will be found in some configuration belonging to R is given by

$$p(R) = \sum_{i \in R} p(i) = \frac{Z(R)}{Z} = \frac{\exp(-F(R)/k_B T)}{\exp(-F/k_B T)} \qquad (2)$$

Here, Z(R) is the partition function for the region R of phase space, and

$$F(R) = -k_B T \ln Z(R) \qquad (3)$$
is defined as the "free energy" for structure R. Clearly, finding the most likely structure representing the system at temperature T corresponds to a) identifying appropriate regions R and b) calculating the restricted free energy F(R) and finding the region R* which minimizes F(R). The compound belonging to the region R* would be termed "thermodynamically stable". Thus, e.g. at T = 0 K, the thermodynamically stable structure corresponds to the global minimum of the energy function, while all the other local minima are identified with structures that are only kinetically stable.

However, the situation is considerably more subtle for T > 0 K, since it is not at all obvious how to identify physically reasonable regions R. For this, we need to properly define the kinetic stability of a compound. This requires a careful analysis of the relevant time scales of the region R that is associated with the compound. On the one hand, there is the equilibration time τ_eq(R;a), defined as the time the system requires to reach equilibrium within a given accuracy a inside the region R of the energy landscape [20, 21]. Here it is assumed that the system is not allowed to leave R during the computation of τ_eq(R). On the other hand, there exists an escape time τ_esc(R;b), defined as the time it takes the system to leave R and move into the rest of the energy landscape with a given probability b; e.g. it corresponds to the time a metastable compound takes to dissociate or go through a phase transition into a more stable phase. If τ_esc(R) > τ_eq(R), it would be possible to equilibrate the system within R before such a transition takes place. Thus one could replace the time-average of any measured observable along some trajectory within R by an ensemble average within R, and we define the system to be locally ergodic [a5] in R on the time scale τ_eq(R). If now the time scale on which we perform our measurements (e.g. a powder diffraction measurement), t_obs, is smaller than τ_esc(R),

$$\tau_{esc}(R) > t_{obs} > \tau_{eq}(R), \qquad (4)$$

then R is associated with a kinetically stable compound [a6]. What constitutes a kinetically stable compound depends therefore on the interplay of three different time scales. Furthermore, both τ_eq(R) and τ_esc(R) vary strongly with temperature.

In principle, it would therefore be necessary to proceed as follows: For a given temperature T, choose some region R of the energy landscape, calculate τ_eq(R;T) and τ_esc(R;T), and determine whether R is locally ergodic and on what time scale compared with relevant observation times. Repeat this for all possible regions, until all kinetically stable regions are found. For each R, the structure of the corresponding metastable compound is given by the statistical mechanical average over all configurations i ∈ R. If desired, one can also compute the local free energy F(R) for each region R and find the thermodynamically stable one, R*, through minimisation of F(R). Obviously, this procedure is very expensive computationally [a7].

In some cases, a shortcut is possible, if the region R is separated by energy barriers of height E_B (measured with respect to the minimum of R) from the rest of the energy landscape. Then, one often finds an Arrhenius law for the escape time,

$$\tau_{esc}(R) \propto \exp(+E_B/k_B T), \qquad (5)$$

which means that τ_esc(R) grows exponentially with 1/T. Since in most compounds τ_eq(R) does not increase exponentially with decreasing temperature, R will usually be kinetically stable below some (possibly very low) temperature. Thus, at low temperatures, local minima will often be associated with locally ergodic regions [a8], leading to the following recipe for the prediction of metastable compounds: First one finds the local minima of the hypersurface of the potential energy, called "structure candidates". This is followed by the determination of the energy barriers E_B around these minima, in order to judge their kinetic stability. A method to calculate energy barriers and local densities of states is described in section 3. Finally, one can determine the thermodynamically stable region R* by computing F(R) for each kinetically stable region R and finding the minimum of F(R).

At high temperatures, however, the identification of locally ergodic regions of the energy landscape will be considerably more difficult, since the physically relevant regions can be considerably more extended, enclosing many local minima of the potential energy hypersurface, e.g. if T is above the transition temperature of a second order phase transition, or in a range where supercooling is possible. Also, entropic barriers would be expected to play a much more prominent role than at low temperatures. So far no simple characteristic for identifying locally ergodic regions has been suggested for these cases, and the very detailed investigation of the whole energy landscape outlined above appears to be required.

[a5] The concept of ergodicity is fundamental to many applications of statistical mechanics [22]. Any actual measurement of a physical quantity is a time-average along the trajectory that the system under investigation follows as a function of time in the 6N-dimensional phase space (3N coordinates and 3N velocities for N interacting particles). But if the system is in thermodynamic equilibrium and the measurement time is long enough, one can often replace such a particular time-average of the observable by an average over a whole ensemble of copies of the system that are weighted according to the equilibrium (Boltzmann) distribution. If such a replacement is possible, one says that the system is ergodic. For systems that exhibit metastability, global ergodicity on all time scales does not hold any more, and one speaks of "broken ergodicity" [23]. Often, local ergodicity replaces global ergodicity, but even this property does not necessarily hold in all systems, e.g. spin glasses might not be locally ergodic on any time scale.

[a6] For observation times t_obs < τ_eq(R), this local ergodicity is usually broken.

[a7] F(R) can be calculated in several ways: from Z(R), if this quantity is available (cf. section 3), from p(R) based on a very long MC/MD simulation, or by computing the free energy difference ΔF(R→0) with respect to a reference system with known free energy F_0: F(R) = F_0 - ΔF(R→0). The most common methods used to calculate free energy differences are "free energy integration", "free energy perturbation" and "finite time variation"; an overview together with discussions of computationally efficient implementations is given in the references [24-26].

[a8] Clearly, there can exist additional locally ergodic regions, which are separated from the rest of the energy landscape by entropic instead of energetic barriers. As an example, picture a valley that is surrounded by high mountains with only one very narrow exit at ground level (no energetic barrier!). In N dimensions, the time to find this exit by walking randomly inside the valley can be much longer than the equilibration time within the valley.
3. THE LID- AND THE THRESHOLD-ALGORITHM

The approach presented here [20, 21, 27] concentrates on achieving as complete a description as possible of regions of the energy landscape close to deep-lying minima. Starting from such a minimum, x_0, the pocket R(L,x_0) in configuration space that can be reached from the starting point without crossing a prescribed energy lid L is searched, either exhaustively for discrete systems (the "lid method") [20] or statistically via random walks for continuous energy landscapes (the "threshold algorithm") [21]. In the case of the lid method, all the states within the pocket and the connections among them are noted. This procedure is repeated for an increasing sequence of energy lids up to a highest value, L_max, and yields both the number of states and the number of local minima with energy E accessible from x_0 using paths below a given lid, n(E;L,x_0) and m(E;L,x_0), respectively. This procedure is repeated for all the local minima x present within the pocket. Thus, the local density(ies) of states, i.e. the density(ies) of states restricted to this pocket R(L,x) in state space, g(E;L,x), together with the energy barriers between the local minima within the pocket can be determined. Based on these results, the statistical mechanical properties of the system can be studied, as long as it remains within the prescribed region of phase space [a9], since the local partition function Z(R) for R(L,x) follows directly from the local density of states:

$$Z(R) = \sum_{i \in R(L,x)} \exp(-E_i/k_B T) = \sum_{E \le L} g(E;L,x) \exp(-E/k_B T) \qquad (6)$$
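For a discrete system, the lid search and Eq. (6) can be sketched as follows. This is a minimal illustration and not the authors' implementation: the eight-state toy landscape, the nearest-neighbour connectivity, and all function names are assumptions made for the example.

```python
import math
from collections import Counter, deque

# Hypothetical toy landscape: states 0..7 on a line, nearest-neighbour moves.
ENERGIES = [0.0, 1.0, 3.0, 1.0, 0.5, 4.0, 0.2, 2.0]


def energy(state):
    return ENERGIES[state]


def neighbours(state):
    return [n for n in (state - 1, state + 1) if 0 <= n < len(ENERGIES)]


def lid_pocket(start, lid):
    """All states reachable from `start` along paths that never exceed `lid`
    (exhaustive breadth-first search)."""
    pocket = {start}
    queue = deque([start])
    while queue:
        state = queue.popleft()
        for nxt in neighbours(state):
            if nxt not in pocket and energy(nxt) <= lid:
                pocket.add(nxt)
                queue.append(nxt)
    return pocket


def local_minima(pocket):
    """States of the pocket that have no lower-energy neighbour."""
    return {s for s in pocket
            if all(energy(n) >= energy(s) for n in neighbours(s))}


def local_partition_function(pocket, kT):
    """Z(R) from the local density of states g(E;L,x), cf. Eq. (6)."""
    g = Counter(energy(s) for s in pocket)
    return sum(count * math.exp(-e / kT) for e, count in g.items())


if __name__ == "__main__":
    for lid in (0.5, 3.0, 4.0):
        pocket = lid_pocket(0, lid)
        print(lid, sorted(pocket), sorted(local_minima(pocket)))
```

In this toy case the second minimum (state 4) first becomes reachable from state 0 at the lid value 3.0, which is the height of the barrier separating the two minima.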
Since for a continuous energy landscape the number of states below any lid of interest is infinite for all practical purposes, this exhaustive search procedure cannot be applied directly [a10]. Thus, instead of trying for complete information about the energy landscape within the pocket right away, we replace the exhaustive search by a statistical one in the following manner: Starting from the local minimum under consideration, x_0, a random walk with a physically reasonable moveclass [a11] is performed, where every step is accepted as long as the prescribed energy threshold, L, is not crossed. During this walk, the energy landscape is sampled, and the number of states with energy E accessible from x_0 using paths below the lid, n(E;L,x_0), can be estimated up to a scale factor. If the number of samples is sufficiently large, n(E;L,x_0) agrees with the density of states within the pocket, g(E;L,x_0), up to the missing normalization factor [a12]. Such a random walk would be termed a single "threshold run". In addition, during the run, the system is quenched into the nearest local minimum. For lid values just above the energy of the starting minimum, the quench will always return the system into its starting configuration. This will change upon increasing the threshold, and at some point, a quench run will end in a minimum x_1 different from the starting point x_0. The lowest lid value where this occurs is thus an upper bound for the height of the energy barrier between these two local minima. As in the discrete case, such threshold runs are performed for all chosen values of the energy lid, using as starting points all local minima encountered during the original minimization or any of the preceding threshold runs. Obviously, the above procedure results in a tree-like barrier structure.

Since the number of states available grows very fast with energy, the system will spend most of the random walk within the region of phase space with energies just below the lid. Thus, for most realistic systems, a random walk of affordable length will not sample enough of the relatively rare low-lying states to draw satisfactory conclusions from n(E;L,x_0) about the functional form of g(E;L,x_0). A solution to this problem consists in using the overlap of the distributions n(E;L,x_0) for different energy lids to determine g(E;L,x_0) even for high values of the threshold L. As long as no additional minima with their concomitant states are added while proceeding from lid L_i to lid L_{i+1} > L_i, no special problems arise in using the overlap procedure for bootstrapping. However, if such a new region suddenly becomes accessible at L_k, a larger number of low-lying states below L_{k-1} are now represented in the sampling n(E;L_k,x_0) than had been available for n(E;L_{k-1},x_0). Thus, one has to correct for this effect when using the overlap procedure, often a non-trivial task.

[a9] In addition, this detailed knowledge of the energy landscape near a deep-lying minimum allows the calculation of the relaxation behaviour within this pocket. The procedure is based on the fact that from the connectivity matrix of the pocket we can derive [20, 21] the transition probability matrix M(T) that describes the time evolution of the system restricted to the pocket for temperature T: p(n) = M^n p(0), where p(i;n) is the probability that the system is in state "i" at timestep n.

[a10] This would require an appropriate discretization of the configuration space, which is in most cases not feasible computationally.

[a11] Moveclass = set of neighbouring configurations in phase space that can be reached from a given point with one step of the random walk. A physically reasonable moveclass allows only those moves of the system that might occur during the regular time-evolution of the system, e.g., small displacements of atoms in a solid, etc. If the only goal is to reach the global minimum as fast as possible, it is of course useful to try to improve the optimisation algorithm by including non-physical moves, avoiding or "eliminating" metastable minima in the process. Obviously, for our purposes such a "high-efficiency" moveclass is a double-edged sword. Thus, during the optimisation (cf. sections 4 and 5) we have included only those non-physical moves that remove or exchange whole ions in the simulation cell, since these operations will only in very special cases lead to the elimination of realistic sub-optima, and even these moves are no longer allowed when we investigate the kinetic stability of the structure candidates.

[a12] If it is possible to calculate the matrix of second derivatives for each minimum analytically, one can find the normal modes and use their density of states to determine the missing normalization factor for each minimum.
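A single threshold run as described above can be sketched for a toy problem. The sketch below makes several assumptions that are not in the text: the landscape is a hypothetical one-dimensional double well, the moveclass is a uniform random displacement, the quench is a simple deterministic downhill walk, and quenches are performed at regular intervals (`quench_every`). It returns the sampled energies, whose histogram estimates n(E;L,x_0) up to a scale factor, and the set of minima found by the quenches.

```python
import random


def energy(x):
    """Hypothetical one-dimensional double-well landscape (arbitrary units)."""
    return x ** 4 - 2.0 * x ** 2 + 0.3 * x


def quench(x, step=1.0e-3, max_iter=100_000):
    """Deterministic downhill walk to the nearest local minimum."""
    for _ in range(max_iter):
        best = min((x - step, x + step), key=energy)
        if energy(best) >= energy(x):
            break
        x = best
    return x


def threshold_run(x0, lid, n_steps, rng, max_step=0.1, quench_every=200):
    """Random walk below the lid; returns sampled energies and minima found."""
    x = x0
    samples = []
    minima = set()
    for k in range(1, n_steps + 1):
        trial = x + rng.uniform(-max_step, max_step)
        if energy(trial) <= lid:      # every move below the lid is accepted
            x = trial
        samples.append(energy(x))
        if k % quench_every == 0:     # quench a copy; the walk itself continues
            minima.add(round(quench(x), 1))
    return samples, minima


if __name__ == "__main__":
    rng = random.Random(0)
    for lid in (-1.0, 0.5):
        samples, minima = threshold_run(-1.0, lid, 20000, rng)
        print(lid, sorted(minima))
```

Repeating the run for an increasing sequence of lids and recording the lowest lid at which a new minimum appears gives the upper bound on the barrier height mentioned in the text.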
The combination of all the local densities of states, g(E;L,x_i), represents a lumped picture of the pocket. This description is intermediate between the overall density of states for the whole pocket, g(E;L_max), and the exhaustive description of every microscopic detail of the energy landscape within the pocket that one can achieve in the discrete case. With this information, it is possible [21] to construct a transition matrix M(T) in the lumped configuration space that allows the simulation of the evolution of the system for temperature T, and thus yields estimates for τ_eq(R) and τ_esc(R).
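How a transition matrix M(T) propagates a probability vector, p(n) = M^n p(0), can be shown on a very small example. The construction below is an assumption made for illustration (uniform proposal among neighbours combined with Metropolis acceptance); the text does not specify how M(T) is built from the connectivity of the pocket, and the three-state chain is hypothetical.

```python
import math


def metropolis_matrix(energies, neighbours, kT):
    """Column-stochastic matrix with M[j][i] = probability of a step i -> j.

    Assumed rule: propose each neighbour with probability 1/max_degree and
    accept with the Metropolis criterion; rejected moves stay at i.
    """
    n = len(energies)
    max_degree = max(len(nb) for nb in neighbours)
    matrix = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in neighbours[i]:
            delta = energies[j] - energies[i]
            accept = 1.0 if delta <= 0.0 else math.exp(-delta / kT)
            matrix[j][i] = accept / max_degree
        matrix[i][i] = 1.0 - sum(matrix[j][i] for j in neighbours[i])
    return matrix


def evolve(matrix, p0, n_steps):
    """Apply p(n) = M p(n-1) repeatedly, starting from p(0) = p0."""
    p = list(p0)
    for _ in range(n_steps):
        p = [sum(row[i] * p[i] for i in range(len(p))) for row in matrix]
    return p


def boltzmann(energies, kT):
    """Equilibrium distribution restricted to the listed states."""
    weights = [math.exp(-e / kT) for e in energies]
    z = sum(weights)
    return [w / z for w in weights]


# Hypothetical three-state pocket: a chain 0 - 1 - 2.
ENERGIES = [0.0, 1.0, 0.5]
NEIGHBOURS = [[1], [0, 2], [1]]

if __name__ == "__main__":
    M = metropolis_matrix(ENERGIES, NEIGHBOURS, kT=0.5)
    print(evolve(M, [1.0, 0.0, 0.0], 500))
    print(boltzmann(ENERGIES, kT=0.5))
```

The number of steps needed for p(n) to approach the restricted Boltzmann distribution gives a rough handle on τ_eq(R); adding absorbing states outside the pocket would be one way to estimate τ_esc(R) in the same framework.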
4. STRUCTURE PREDICTION AT LOW TEMPERATURES

4.1. General aspects

For the remainder of the chapter, we will concentrate on the case presumably most important for crystalline compounds, where kinetically stable structures are associated with local minima of the energy landscape. The determination of all the local minima of the energy landscape requires the use of global optimisation methods. Since one single run of a global optimisation procedure typically involves hundreds of thousands or millions of energy evaluations, and it is in general necessary to perform hundreds of such runs in order to acquire sufficient statistics on the distribution of deep-lying minima, any reduction in the number of function evaluations involved in the calculation of the energy is of great importance. It is therefore usually not feasible to perform ab-initio calculations of the energy, and one has to simplify the energy calculation by using empirical effective potentials for the interaction between the atoms instead. These potentials have to reflect correctly both the local and the long-range interaction among the atoms/ions: The atoms must not be allowed to overlap too much, and, except via electrostatic forces, they must not interact directly at long distances. Additional terms will enter that reflect the local bonding situation of the atoms, and here usually a "selection" takes place, depending on whether we want to stress e.g. the covalent, ionic or metallic character of the hypothetical compound.

Furthermore, we use simulation cells with periodic boundary conditions in order to deal with the large number of atoms in a solid. Since we are mainly interested in crystalline compounds, this choice is preferable to the open boundaries of a cluster.
Note that these periodic boundary conditions are not very restrictive, since size, shape and symmetry of the simulation cell can be freely varied during the many global optimisation runs necessary to achieve sufficient statistics. In addition to the cell parameters, a multitude of other parameters can be varied during the optimisation, e.g. the location of the atoms, their degree of ionisation (if we are dealing with ionic systems), and even the number of atoms [a13] and the composition of the system within the cell. The initial configurations for the optimisations usually consist of a cell of ca. 10 times the volume of all the atoms taken together, with the (neutral) atoms placed at random positions within the cell.

While simple empirical potentials can produce many promising structures, precise values of e.g. their cell parameters or the exact locations of the atoms in the cell are not to be expected. Obtaining such details would require another optimisation round, e.g. for fixed composition, using a more realistic potential. Preferably, one would want to employ either highly refined (semi)empirical potentials [28-32] or some ab-initio method [33-35] [a14].

After the optimisation stage has been finished, the candidate structures with the lowest ground state energy will be analysed with respect to their stability and their local density of states using the methods described in section 3. The final result of the whole procedure would now consist of an overview of a large number of preferred compositions, and of possible structure types and structure elements that would be expected to occur under certain conditions in the investigated chemical system, together with estimates of the relative ground state energies, the kinetic stability and the local density of states of the most promising ones. The last information allows the tentative identification of thermodynamically stable and metastable structures.

[a13] Instead of varying the number of atoms arbitrarily during a given run, it has often proven to be more efficient to keep the composition fixed, and to repeat the optimisation with all the other promising compositions afterwards. Since this restriction makes the addition and removal of atoms rather awkward, fixing the composition usually implies fixing the number of atoms, too. Therefore, we also need to repeat the runs for a given composition with different numbers of formula units in the simulation cell. Our observations have shown that if e.g. the number of formula units is doubled, the same minima are still present, but that many additional minima appear, of course. As pointed out in the introduction, the number of possible metastable periodic structures appears to be limitless.
[a14] However, it would be very time-consuming to perform a global optimisation, especially in the latter case. But, since usually many promising structure candidates have already been accumulated at this point, one would restrict oneself to essentially local minimisations for each structure type.

4.2. Specific optimisation algorithms

While it is well known that one can devise more and more refined optimisation algorithms based on more and more detailed knowledge of the energy surface of the system, we have so far used only one family of algorithms, which is based on the stochastic simulated annealing algorithm [36] introduced by Kirkpatrick et al. [37] and Černý [38]. The great advantage of this method lies in the relative ease of implementation, the very general applicability independent of the specific optimisation problem, and the great freedom in the choice of a moveclass.

Simulated annealing is based on the Metropolis (Monte-Carlo) algorithm [39], which implements a weighted random walk through configuration space. Starting from a current configuration "i", a neighbouring configuration "i+1" is chosen at random according to a set of rules (the "moveclass"). If the energy E_{i+1} is below or equal to E_i, the move is always accepted, i.e., "i+1" becomes the new current configuration. Otherwise, the move is only accepted with probability exp(-(E_{i+1} - E_i)/C), where C is a control parameter of the random walk. Thus during a sequence of such MC steps, the system can climb over barriers of the energy surface. It can be shown that in the long-time limit (t → ∞) for an ergodic system the probability p(i) of visiting state (= configuration) "i" is given by the Boltzmann distribution for the system at a temperature [a15] given by T = C/k_B. Furthermore, in analogy to the annealing procedure of a real material, we can reduce the control parameter C down to zero, and expect that the system will end up, with a high probability, in some deep-lying minimum. It can be shown that during such a simulated annealing run, an ergodic system will reach the global minimum [36, 40, 41], at least for t → ∞.

Clearly, the most important parameters of this optimisation procedure are the temperature program, i.e., the rule according to which C is decreased to zero, and the moveclass. While a considerable effort has been devoted to the development of efficient general temperature programs [40, 42-46], the choice of a good moveclass still remains an open question, since it is highly problem dependent. For our explorations of the energy landscapes of solids, the best results seem to be gained from a moveclass where 60% - 80% of the moves are devoted to movements of single atoms, ca. 10% each to the change of ionic charges and the addition/removal/exchange of atoms, while the remaining ones should be used to adjust the size and shape of the simulation cell. However, the selection of the moveclass should also take the size of the system, the goals of the current optimisation, and the design of the temperature program into account. Often it might even be useful to change the moveclass as a function of temperature.

As far as the temperature program is concerned, reasonable results have been achieved using T_n = T_0 f^n (n = 0, ..., n_max), with m MC steps between temperature updates. Here, T_0, f, n_max and m should be chosen according to the size of the system, the moveclass and the objective of the current optimisation run. The latter refers to the fact that different strategies have to be chosen depending on whether one wants to find as deep a minimum as possible for a given amount of computer time or to gain a general overview of the statistical distribution of local minima and their accessibility [a16].

[a15] From this we can conclude that, in certain circumstances, we may interpret T as an actual temperature.
We have employed two algorithms for the purpose of local optimisation: a stochastic quench, which corresponds to a simulated annealing algorithm with C = 0, and a steepest descent algorithm with a line search option [47]. While the stochastic quench can be used with the full moveclass including charge transfer and atom exchange, the gradient descent is only useful when only the cell parameters and the atom positions need to be adjusted.
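The Metropolis acceptance rule, the geometric temperature program T_n = T_0 f^n with m steps per temperature, and the stochastic quench (C = 0) can be combined in a few lines. The following is a minimal sketch on a hypothetical one-dimensional double well; the landscape, the displacement moveclass in `propose`, and all parameter values are assumptions for illustration and are far simpler than the moveclass for solids described above.

```python
import math
import random


def energy(x):
    """Hypothetical double-well landscape (arbitrary units)."""
    return x ** 4 - 2.0 * x ** 2 + 0.3 * x


def propose(x, rng, max_step=0.2):
    """Toy moveclass: a small random displacement."""
    return x + rng.uniform(-max_step, max_step)


def metropolis_accept(delta_e, control, rng):
    """Accept downhill moves always, uphill moves with exp(-delta_e / C)."""
    if delta_e <= 0.0:
        return True
    if control <= 0.0:          # C = 0: stochastic quench, no uphill moves
        return False
    return rng.random() < math.exp(-delta_e / control)


def simulated_annealing(x0, t0, f, n_max, m, rng):
    """Geometric schedule C_n = t0 * f**n with m MC steps per value of C."""
    x, e = x0, energy(x0)
    best_x, best_e = x, e
    for n in range(n_max + 1):
        control = t0 * f ** n
        for _ in range(m):
            trial = propose(x, rng)
            e_trial = energy(trial)
            if metropolis_accept(e_trial - e, control, rng):
                x, e = trial, e_trial
                if e < best_e:
                    best_x, best_e = x, e
    return x, e, best_x, best_e


def stochastic_quench(x0, n_steps, rng):
    """Simulated annealing with the control parameter fixed at zero."""
    return simulated_annealing(x0, t0=0.0, f=1.0, n_max=0, m=n_steps, rng=rng)


if __name__ == "__main__":
    rng = random.Random(42)
    print(simulated_annealing(1.0, t0=1.0, f=0.9, n_max=60, m=200, rng=rng))
    print(stochastic_quench(0.5, 5000, rng))
```

Starting the quench at x = 0.5 ends in the shallower right-hand well, whereas the annealing run can cross the barrier while C is still large; which well it finally settles in depends on the schedule and the random seed.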
4.3. Specific empirical potentials
The choice of the empirical potential for the global optimisation obviously depends very strongly on the system one investigates. So far, we have investigated two classes of systems, noble gases 48 and their mixtures 49, and binary 50 and ternary 51, 54 ionic compounds, since relatively simple two-body potentials are already able to capture many qualitative and also some semi-quantitative aspects of the energy hypersurface of the compound in the derivation of effective energy functions for such test systems. Here we will only discuss the ionic systems.
a16 An extreme option consists of essentially replacing the global optimisation with many local optimisations, where it is hoped that the initial configurations are chosen such that all relevant regions of the configuration space are covered. This requires that either the barrier structure is simple enough that quench runs are not always stranded in high-lying minima even when starting from random initial configurations, or that the starting configurations are already so close to the best sub-optima that only some final adjustments are necessary to reach the desired minimum configurations. The latter case applies, if realistic initial states are constructed by some special algorithm, or if the initial configurations are based on the results of earlier optimisation runs.
The approximate potentials for the description of ionic systems consist of three terms: a screened Coulomb term, $\exp(-ar)/r$, a repulsive term ($r^{-n}$, usually n = 12), and an attractive dispersion term ($r^{-6}$). If the damping factor in the Coulomb term is not present, i.e., a = 0, the energy function is evaluated using the summation method suggested by de Leeuw 55:

$$V_{ij}(r_{ij}) = \frac{q_i q_j}{4\pi\varepsilon_0 r_{ij}} \exp(-a r_{ij}) + \varepsilon_{ij}\left[\left(\frac{\sigma_{ij}}{r_{ij}}\right)^{n} - \left(\frac{\sigma_{ij}}{r_{ij}}\right)^{6}\right] \qquad (7)$$
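As an illustration of equation (7), the following Python sketch evaluates the three terms of the pair potential for a single ion pair. It reads the equation as a screened Coulomb term plus a Lennard-Jones-like repulsion/dispersion term with strength `eps_ij` and length scale `sigma_ij`; these parameter names, the unit choice (eV and Angstrom) and the function name are assumptions for illustration. For a = 0 in a periodic solid the Coulomb sum additionally requires an Ewald-type summation, which this sketch does not attempt.

```python
import math

# e^2 / (4 * pi * eps0) expressed in eV * Angstrom
COULOMB_EV_ANGSTROM = 14.399645


def pair_potential(r, qi, qj, eps_ij, sigma_ij, a=0.0, n=12):
    """Two-body energy in eV for distance r in Angstrom.

    qi, qj   : ionic charges in units of the elementary charge
    eps_ij   : strength of the repulsion/dispersion term (eV)
    sigma_ij : length scale of the repulsion/dispersion term (Angstrom)
    a        : damping factor of the screened Coulomb term (1/Angstrom)
    n        : exponent of the repulsive term (usually 12)
    """
    coulomb = COULOMB_EV_ANGSTROM * qi * qj * math.exp(-a * r) / r
    x = sigma_ij / r
    return coulomb + eps_ij * (x**n - x**6)
```

For example, `pair_potential(2.8, 1, -1, 0.01, 2.5, a=0.1)` gives the energy of a singly charged cation-anion pair at 2.8 Angstrom for purely hypothetical parameter values.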
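In addition to these two-body terms, the energy function contains "one-body terms" $E_0(i)$ (the ionisation energy or the electron affinity, respectively), a term pv' when allowing volume changes, where p is the pressure and v' the volume per atom, and the chemical potential $\mu(i)$, the latter being relevant if the number of atoms is allowed to change during the optimisation. Thus, the energy function per atom we use takes the form

$$E = \frac{1}{2N}\sum_{i \neq j} V_{ij}(r_{ij}) + \frac{1}{N}\sum_{i} E_0(i) \; \left(+\, p v'\right) \; \left(+\, \frac{1}{N}\sum_{i} \mu(i)\right) \qquad (8)$$

where N denotes the number of atoms in the simulation cell, and the terms in parentheses are included only if volume changes or changes in the number of atoms are allowed.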
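The bookkeeping of equation (8) can be made explicit with a short Python sketch. It reads the prefactors as 1/(2N) for the pair sum and 1/N for the one-body sums, an assumption based on the statement that this is an energy per atom. The function treats a finite set of atoms without periodic images, and the callback `pair_energy` and all numbers in the usage note are hypothetical.

```python
import math


def energy_per_atom(positions, pair_energy, e0, pressure=0.0, volume=0.0, mu=None):
    """Energy per atom: pair terms, one-body terms, optional p*v' and chemical potentials.

    positions   : list of coordinate tuples
    pair_energy : callable (i, j, r) -> V_ij(r)
    e0          : list of one-body energies E_0(i)
    pressure    : pressure p (in units consistent with energy / volume)
    volume      : total cell volume, so that v' = volume / N
    mu          : optional list of chemical potentials mu(i)
    """
    n = len(positions)
    pair_sum = 0.0
    for i in range(n):
        for j in range(n):
            if i != j:
                r = math.dist(positions[i], positions[j])
                pair_sum += pair_energy(i, j, r)
    # The factor 1/2 corrects for counting every pair twice.
    energy = pair_sum / (2 * n) + sum(e0) / n
    energy += pressure * volume / n
    if mu is not None:
        energy += sum(mu) / n
    return energy


# Hypothetical two-atom example with a bare -1/r attraction.
demo_positions = [(0.0, 0.0, 0.0), (2.0, 0.0, 0.0)]
demo_energy = energy_per_atom(demo_positions, lambda i, j, r: -1.0 / r, [1.0, -0.5])
```

In the demo the pair sum contributes -0.25 and the one-body terms +0.25 per atom, so the two contributions cancel.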
Three major routes for determining the parameters in the empirical potentials are available, depending on the amount of pre-knowledge about the system. If the participating atoms have already been studied extensively in the context of molecular studies using ab-initio methods, it is sometimes possible to derive effective two- and three-body potentials based on these theoretical results 56-58. Experience shows, however, that so far this path involves considerable effort, taking months of investigation. Furthermore, the potentials cannot always be transferred with the expected accuracy from the molecular environment to the one in the solid state. Thus, one is usually forced to fit the parameters to some experimental observations, either to the properties of some specific known compound(s) or according to average "atomic" properties of the participating atoms, e.g. ionic radii and deformabilities of the ions.
The first option encounters two basic problems, however. For one, if no compound of the system is known beforehand, obviously no fit can be performed. Secondly, if a compound is already available experimentally, one can very easily prejudice the model by using only this substance for fitting, thus suppressing alternative minima of the (true) energy hypersurface. In contrast, the use of "atomic" properties averaged over many different compounds and local atomic environments should result in somewhat more unbiased effective potentials a17 . The disadvantage lies, of course, in the lack of specificity of the potential. Because of this possible trade-off of accuracy vs. generality, one will choose the fit procedure according to the specific circumstances of the calculation. In both cases, but especially when using average atomic parameters, one should therefore repeat the optimisation runs while varying the potential about the "best" values chosen originally a18 .
a17 Recently, the use of parameters averaged over many binary compounds has been proposed by Bush et al. 59
a18 For an in-depth discussion of the many aspects involved in choosing an empirical potential for the purpose of structure prediction using global optimisation methods, see ref. 51.
5. EXAMPLES
Let us now discuss some representative and illuminating examples of structure prediction for the case of ionic compounds. Both binary and ternary compounds have been investigated, the latter including mostly unknown or at least not-yet-synthesized compounds. The general form of the potential was as given in section (4.3), with p = 0 and usually without chemical potentials. Both the case a = 0, making an Ewald-summation necessary, and the case a > 0 ( = screened Coulomb potential) have been used in constructing the effective potential. Average ionic radii, and information about the polarizability or "hardness" of the ions, were used to establish starting values of the free parameters in the potential: $n$, $\sigma_{ij}$ and $\varepsilon_{ij}$.
First, a large number of binary systems were investigated 50 using long simulated annealing runs (> $10^6$ MC-steps) that would be expected to reach a very good sub-optimum of the energy hypersurface. For most of these runs, the composition of the system and the number of atoms within the simulation cell were kept fixed. Initially, electron transfer between randomly chosen pairs of atoms/ions was allowed. Of course, once the preferred ionic charges had been established as part of the global optimisation procedure, additional studies of the system using ions with fixed charges were admissible. The results are listed in table 1. We note that in nearly all instances the best sub-optimum found during the simulated annealing run turned out to be either the structure seen experimentally to be the preferred one or the structure that agreed with the radius-ratio-rule.
Detailed Study of Binary Systems. While the above results clearly show that it is possible to determine very good sub-optima of the energy hypersurface of binary ionic compounds using effective potentials based on characteristic atomic quantities, it was felt necessary to investigate in more detail the energy hypersurface of some particular systems.
We have chosen NaCl as a representative AB-system 50, and MgF2, MgCl2, CaF2 and CaCl2 as AB2-systems 53. The parameters in the effective potential were varied slowly from each set of optimisation runs to the next, in order to study the robustness of the candidate structures. Each such set of runs consisted of 14 or 20 global optimisations of $10^5$ to $3 \times 10^5$ simulated annealing steps, the only difference being the initial value of the random number generator. Thus, for each system, several hundred global optimisation runs were performed. Since it had already been established during several sets of optimisation runs where charge transfer was allowed that the best sub-optima occurred when the "fully" ionised ions were present (Na+, Mg2+, Ca2+, F-, Cl-), the ionic charges were fixed at these values, and the moveclass consisted of movement/exchange of the ions, and variation of the size and shape of the simulation cell.
Na/Cl. A large number of structure candidates corresponding to local minima of the energy hypersurface were found in this system. They are listed in table 2. Two of these structures were judged to be of special interest: the "5-5"-variant (figure 1), consisting of trigonal bipyramids of Cl--ions around the Na+-ions, and a "NiAs"-type variant a19 . For each of these, the energy barriers separating them from other local minima were determined using the threshold algorithm.
a19 Recently, Martin and Corbett have succeeded in synthesizing an unprecedented monohalide, LaI, exhibiting the NiAs-structure-type 60.
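The practice of repeating the same optimisation with different initial values of the random number generator, and then recording how often each minimum is found, can be sketched as follows. For brevity the sketch uses a quench on a hypothetical one-dimensional landscape instead of a full simulated annealing run in a simulation cell; the landscape, the labelling of minima by the nearest integer and all function names are assumptions for illustration only.

```python
import math
import random
from collections import Counter


def toy_energy(x):
    """Hypothetical landscape with one local minimum close to every integer."""
    return 0.1 * x * x - math.cos(2.0 * math.pi * x)


def quench_run(seed, n_steps=3000, step=0.05):
    """One optimisation run; only the random seed differs between runs."""
    rng = random.Random(seed)
    x = rng.uniform(-3.0, 3.0)
    e = toy_energy(x)
    for _ in range(n_steps):
        trial = x + rng.uniform(-step, step)
        e_trial = toy_energy(trial)
        if e_trial <= e:
            x, e = trial, e_trial
    return round(x)  # label of the minimum that was reached


def occurrence_percent(run_once, seeds):
    """Frequency (in %) with which each minimum is found over a set of runs."""
    counts = Counter(run_once(seed) for seed in seeds)
    total = sum(counts.values())
    return {label: 100.0 * n / total for label, n in counts.items()}


table = occurrence_percent(quench_run, range(20))
```

The resulting dictionary plays the role of an "occurrence (in %)" column: it characterises how accessible each minimum is for the chosen algorithm, not how deep it is.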
Table 1
Best results of simulated annealing runs including a comparison with the observed structures. During the optimisation, many additional structures, corresponding to local minima of the energy hypersurface, have been found in these systems.

Input      Structure (sim.ann.)                    Structure (observed)
Na - Cl    NaCl-structure                          NaCl-structure
Cs - Cl    CsCl-structure                          CsCl-structure
Li - F     NaCl-structure                          NaCl-structure
Na - F     NaCl-structure                          NaCl-structure
K - F      NaCl/CsCl-structure                     NaCl-structure
Rb - F     NaCl/CsCl-structure                     NaCl-structure
Cs - F     NaCl/CsCl-structure                     NaCl-structure
Ba - O     NaCl-structure                          NaCl-structure
Li - I     NaCl-structure                          NaCl-structure
Sr - O     NaCl-structure                          NaCl-structure
Ni - O     NaCl-structure                          NaCl-structure
Ca - O     NaCl-structure                          NaCl-structure
Mg - O     NaCl-structure                          NaCl-structure
Ca - F     fluorite-structure                      fluorite-structure
Mg - F     rutile-structure                        rutile-structure
Li - O     antifluorite-structure                  antifluorite-structure
K - O      antifluorite-structure                  antifluorite-structure
Sr - Cl    fluorite-structure                      fluorite-structure
Ca - Cl    CaCl2-structure                         CaCl2-structure
Sn - O     rutile-structure                        rutile-structure
Ti - O     rutile-structure                        rutile-structure
Na - O     antifluorite-structure                  antifluorite-structure
Si - O     cristobalite-, tridymite-structure      e.g. quartz-, tridymite-, cristobalite-structure
Table 2
Overview of the minima found for the system Na - Cl with composition 1:1. The terms "dense/open structures" refer to arrangements of different coordination polyhedra without/with large channels or cavities in the structure.

coordination number    structure-types/                                            occurrence
of Na by Cl            structure-elements                                          (in %)
4                      anti-PtS-structure, sphalerite, edge-connected tetrahedra     5.5
5                      edge-connected trigonal bipyramids                            6.8
5                      edge-connected square pyramids                                3.0
6                      NaCl-structure                                               37.2
6                      NiAs-structure                                               20.0
6                      anti-NiAs-structure                                           4.3
6                      other structures                                              1.8
7                      monocapped prisms                                             0.3
8                      CsCl-structure                                                0.1
mixed                  "dense" structures                                           12.0
mixed                  "open" structures                                             9.0

Figure 1: 5-5-structure candidate for NaCl: Na (black) and Cl (white) are coordinated bipyramidally. Ions are not drawn to scale. Average Na-Cl-distance 2.7 Å.

While the NiAs-type structure exhibited a barrier in excess of 0.13 eV/atom (T = 1300 K), the energy barrier for the "5-5"-structure was found to be 0.02 eV/atom (T = 200 K). Thus, the latter would not be expected to survive at high temperatures, unless it were stabilized in some fashion a20 . Another point to note is the high robustness of the deepest local minima with respect to quite large variations of the parameters in the potential 50 a21 .
a20 Here, one should note that for a set of optimisation runs where large amounts of excess chlorine atoms were present, the "5-5"-type structure with the excess chlorine located within the channels occurred as the major stable alternative to the phase separation into one region filled with neutral chlorine and a second one containing the standard NaCl-structure. Thus, one might expect that the "5-5"-structure could be stabilized in the presence of some third component that fills its channels.
a21 Additional tests of whether there were differences between using a damping factor and an Ewald-type summation for the electrostatic terms showed that the distribution of local minima encountered during the optimisations did not change significantly when using a damping factor (a > 0).
In order to gain some insight into general aspects of the global optimisation at large pressures, we have considered pressures of 1 GPa, 10 GPa, 100 GPa and $10^3$ GPa by adding a term pv' to the energy function a22 , where p is the pressure and v' the volume per atom: H = E + pv'. By minimising the enthalpy H, we have found that for pressures up to 10 GPa the NaCl-structure was the deepest minimum, while above 100 GPa seven-fold and eight-fold (CsCl-structure) coordinations of Na by Cl are preferred over the six-fold ones. This agrees qualitatively with the experimental observation that the high-pressure modification of NaCl probably shows a CsCl-structure above 29 GPa 61.
a22 Since we are searching for local minima of the potential energy at T = 0 K, additional terms proportional to $k_B T \ln(V)$ that need to be considered in simulations at finite temperature vanish. Analogously, the influence of external electric or magnetic fields can be included 22.
AB2 Systems. Since layered structures are important variants in AB2 systems (e.g. the CdI2 and CdCl2 families of structures), it was deemed necessary to always use a Coulomb potential without a damping term. The method chosen to perform the full Coulomb sum was the one suggested by de Leeuw 55. Again, the resulting structure candidates in general did not depend on the exact values of the effective potential. The most common structure types found are summarized in table 3. We note that these include many commonly found structures like anatase, rutile and fluorite, but also layer structures like CdI2. Although the energies of the latter structure candidates usually are higher than the best sub-optima, it is a pleasant surprise that these layer structures were competitive even in a purely ionic picture. Using the threshold algorithm (cf. section 3), we computed the barrier structure of the system MgF2 (z = 2) on the tree graph level 27 (figure 2). We see that the second deepest local minimum, the anatase structure, is separated by quite high barriers from the global minimum, the rutile structure. Thus, one would suspect that MgF2 in the anatase structure might well turn out to be kinetically stable, while e.g. for the structure "VII", exhibiting monocapped F--prisms about the Mg2+-ions, the energy barrier separating it from the rutile-minimum is so small that "VII" will not be expected to exist at finite temperatures.
AB3 Systems. Finally, as an example for AB3 systems 51, 52, the energy landscapes of the alkali metal nitrides Li3N, Na3N, K3N, Rb3N and Cs3N were explored. During the initial (long) optimisation runs with charge transfer allowed, it was found that in the first two cases the full oxidation states (Li+, Na+, N3-) were reached easily, while the latter three did not succeed in completely crossing the barriers on the energy hypersurface separating the "fully" ionised from the "partly" ionised configurations. Both Li3N and Na3N exhibited several interesting candidate structures as good sub-optima: the Li3N- and the Li3P-structure known to occur in nature, and a sheared "SrO3"-sub-structure of the SrTiO3-perovskite structure, both in the cubic and hexagonal variant, shown in figure 3. Based on these results, it appears reasonable to suggest that Na3N, if it can be synthesized eventually, will crystallize in one of these four structure types.
Subsequent energy calculations for these structures using the Hartree-Fock-type ab-initio program CRYSTAL92 34 showed that their energies were quite close to each other. Concerning the kinetic stability, calculations with the threshold algorithm showed that the two "SrO3"-sub-structures were separated from each other by a rather small barrier (= 0.1 r), while the other barriers all exceeded 1 r

Figure 2: Treegraph of the barrier structure for the MgF2-system (z = 2). Circled regions indicate regions containing many local minima (energies in eV/atom): VI-a = rutile (E = -5.35), VI-b = anatase (E = -5.20), VI-c = half-filled NaCl-structure (E = -5.03), VI-d = CdI2-structure (E = -5.93), VI-e = structure consisting of trigonal prisms (E = -5.93), VII = structures consisting of mono-capped prisms (E = -6.12). The shaded region contains several rather deep minima with structures consisting of trigonal bipyramids (n = 5), quadratic pyramids (n = 5) and prisms (n = 6) (always MgFn-coordination polyhedra). Further regions in the graph are labelled "CN = 4 layered structures", "CN = 4,5" and "CN = 5,6".
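How barrier information of the kind shown in the tree graph can be obtained is illustrated by the following Python sketch of a threshold-type run. It assumes the usual form of the threshold algorithm: a random walk in which every move is accepted as long as the energy stays below a prescribed lid, followed by quenches that identify the minima reached. The double-well toy energy (minima at x = -1 and x = +1 with E = 0, barrier E = 1 at x = 0), the step sizes and all function names are hypothetical.

```python
import random


def double_well(x):
    """Toy landscape: two minima at x = -1 and x = +1, barrier of height 1 at x = 0."""
    return (x * x - 1.0) ** 2


def quench(energy, x, rng, step=0.02, n_steps=2000):
    """Zero-temperature random search; returns the position of the minimum reached."""
    e = energy(x)
    for _ in range(n_steps):
        trial = x + rng.uniform(-step, step)
        e_trial = energy(trial)
        if e_trial <= e:
            x, e = trial, e_trial
    return x


def threshold_walk(energy, x, lid, n_walk, rng, step=0.05):
    """Random walk that accepts every move whose energy stays below the lid."""
    for _ in range(n_walk):
        trial = x + rng.uniform(-step, step)
        if energy(trial) <= lid:
            x = trial
    return x


def reachable_minima(energy, x_start, lids, n_runs=20, n_walk=10000, seed=0):
    """For each lid, collect the minima found by quenching after the walk."""
    rng = random.Random(seed)
    found = {}
    for lid in lids:
        minima = set()
        for _ in range(n_runs):
            x_end = threshold_walk(energy, x_start, lid, n_walk, rng)
            minima.add(round(quench(energy, x_end, rng)))
        found[lid] = minima
    return found


result = reachable_minima(double_well, -1.0, [0.5, 1.5])
```

In this toy case the second minimum becomes reachable only for the higher lid, which brackets the barrier between 0.5 and 1.5; repeating the procedure for a sequence of lids and starting minima yields the data from which a tree graph can be drawn.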
Table 3
Overview of the most important structure-types/elements seen in the systems MgF2, CaF2, MgCl2 and CaCl2. The polyhedra listed always refer to the coordination of the cation by the anions. The frequency of occurrence during optimisation runs is given in percent.

structure-elements/structure-types                                        MgF2   MgCl2   CaF2   CaCl2
tetrahedra (e.g., HgI2-structure, ZnO2-sub-structure of SrZnO2)              0      31      3       4
trigonal bipyramids connected via edges and corners                          6       1      0       0
CdCl2/CdI2-structure                                                        20       7      5      16
rutile                                                                      37      45      9       0
anatase                                                                      2       0      0       0
CaCl2-structure                                                              0       0      0      44
other structures based on octahedra                                          7      10      0      10
prisms connected via edges and corners                                       3       0      0       2
monocapped prisms connected via edges and corners                           21       0     15      22
fluorite                                                                     0       0     66       0
other structures                                                             4       6      2       2
Figure 3a: Sheared SrO3-substructure of the hexagonal perovskite structure as model for the hypothetical compound Na3N. N3--ions are surrounded by Na+-ions forming cubic close packed distorted anti-cube-octahedra. Na+-ions are depicted as black spheres (not drawn to scale). Average Na-Na- and Na-N-distance about 2.6 Å (space group Pmmn, no. 59).

Figure 3b: Sheared SrO3-substructure of the cubic perovskite structure as model for the hypothetical compound Na3N. N3--ions are surrounded by Na+-ions forming cubic close packed distorted cube-octahedra. Na+-ions are depicted as white spheres (not drawn to scale). Average Na-Na- and Na-N-distance about 2.6 Å (space group P42/mmc, no. 131).

Additional optimisations were performed using K+, Rb+, Cs+, and N3--ions for the initial configuration. The observed structures ranged from the ReO3-structure for Cs3N and a cube-like variant of the Li3N-structure for Rb3N and Cs3N to eleven-fold (so-called "tetragonal close packing") and twelve-fold (like Na3N) coordinations of the N3--ions by the cations in K3N. One should note the decrease in the coordination number with increasing size of the cation, reminiscent of the radius ratio rule.
Ternary Ionic Systems: Ca/Ti/O. One of the examples of ternary ionic compounds 51 was the system Ca-Ti-O. Of course, there exists a large range of compositions that allow for "full" ionisation (Ca2+, Ti4+, O2-) of the system: CanTimOp with p = n + 2m. While in binary compounds the composition corresponding to the "fully" ionised case is established nearly automatically during a general global optimisation run, there now exist a number of equally valid options that correspond to deep local minima separated by rather high energy barriers. For the test runs, the three cases n = 2 and m = 1, n = 1 and m = 2, and n = 3 and m = 1 were chosen besides the simplest one, n = 1 and m = 1, i.e. CaTiO3. The potential depended only on the average ionic radii, with the assumption that the ions were of medium hardness, and the Coulomb-potential contained a damping factor.
In half of the relatively short ($2 \times 10^5$ steps) runs for the case (n = 1, m = 1), the best sub-optima found were perovskite-type structures, while in only 20% of the runs the system ended in a configuration that was not "fully" ionised (Ca2+Ti3+O-(O2-)2). This agrees well with the experimentally observed 5 structure of CaTiO3, which is considered to be a distorted perovskite-structure. In addition, we estimated the heat of formation at T = 0 K with respect to the educts CaO and TiO2, which was found to be about 130 kJ/mol. Considering the highly simplified model for the potential energy, this compares reasonably well a23 with the experimentally found value 62, 90 kJ/mol at standard conditions. For the other three compositions we have investigated, the best sub-optimal structures were "fully" ionised, too. However, so far only some "layered" structures of Ca2TiO4, related to the structure 63 of Ca2SnO4, were found to be thermodynamically stable (about 40 kJ/mol) with respect to dissociation into CaO and TiO2.
Ca/Si/Br. As a final example, we will discuss the determination of structure candidates for an unexplored ternary system. The system Ca/Si/Br appears to be well-suited for this purpose. From the participating components, one would conclude that a hypothetical ternary compound of e.g. the composition Ca3SiBr2 should be close enough to an ionic compound (Ca2+, Si4-, Br-) to allow the use of the simple potential function we have been testing so far. Furthermore, both binary compounds (Ca2Si, CaBr2) are known, and it is therefore possible to fit the parameters in the two-body potentials to their properties. Last, but not least, the system has not yet been subject to intensive attempts to synthesize such a ternary compound, leading to the expectation of eventual experimental verification of the predictions. During the test runs needed for the fitting of the parameters, the binary (ionic) compounds proved to be stable local minima. Therefore, we did not precede the long global optimisation runs by optimisations involving charge transfer.
Instead, the ionic charges were fixed from the outset, and these ions were then, as usual, placed at random positions within a large simulation cell. We have concentrated 54 on the composition Ca3SiBr2. Each simulated annealing run involved several million optimisation steps, where up to four formula units ( = 24 ions) per simulation cell were used. The structure candidates belonging to two of the best sub-optima found are shown in figures 4a and 4b. The structure in figure 4a can be derived from a NaCl-type structure, where the "Na"-sublattice is occupied by the Ca2+ ions, while the "Cl"-sublattice contains the Si4- and Br- ions in such a way that the Si4- ions are far away from each other while still allowing for a relaxation of the inevitable distortions due to the somewhat different sizes and hardnesses of the spheres representing Si4- and Br- ions. In figure 4b, an alternative structure is shown, derived from the CsCl-type. Again, the anions occupy the Cl-sublattice, while the cations are located on the sites of the Cs-sublattice. Since no obvious way exists to judge the quality of these structure candidates (at least until the compound has been synthesized!), the ground state energies of these sub-optimal structures were also calculated using the Hartree-Fock-type program CRYSTAL92 34 (no global optimisation runs, of course). It turned out that these calculations could not decide either which structure candidate should actually be preferred: EHF(NaCl-type) - EHF(CsCl-type) = -0.04 eV/atom compared to E(NaCl-type) - E(CsCl-type) = -0.04 eV/atom for the simple potentials.
a23 Note that our energy of formation with respect to the binary compounds is calculated at T = 0 K, p = 0 Pa, while the experimental value is valid at standard conditions. Since the resulting difference in enthalpy is basically given by the integral over the specific heat, and the contribution from the binary compounds will to a large degree cancel the contribution from the ternary compound, the rough comparison we perform appears to be reasonable.
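The charge-balance argument behind the "fully" ionised compositions of the two ternary systems can be checked with a few lines of Python. This is only an arithmetic illustration; the dictionary of formal charges and the function names are introduced here for convenience.

```python
# Formal ionic charges assumed for the "fully" ionised species.
CHARGES = {"Ca": 2, "Ti": 4, "O": -2, "Si": -4, "Br": -1}


def net_charge(composition, charges=CHARGES):
    """Total formal charge of a formula unit given as {element: count}."""
    return sum(count * charges[ion] for ion, count in composition.items())


def neutral_titanates(max_n, max_m):
    """Compositions Ca_n Ti_m O_p with p = n + 2m, returned as (n, m, p) tuples."""
    return [(n, m, n + 2 * m)
            for n in range(1, max_n + 1)
            for m in range(1, max_m + 1)]
```

For instance, `net_charge({"Ca": 3, "Si": 1, "Br": 2})` evaluates 3(+2) + (-4) + 2(-1), which is zero, and `neutral_titanates(3, 2)` contains the four compositions (1, 1, 3), (2, 1, 4), (1, 2, 5) and (3, 1, 5) used in the test runs.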
Figure 4a: Structure candidate for Ca3SiBr2 analogous to the NaCl-structure. Si4- ions lie in the center of shaded octahedra formed by Ca2+-ions, Br- ions within the white ones.
Figure 4b: Structure candidate for Ca3SiBr2 analogous to the CsCl-structure. Si4- ions lie in the center of shaded cubes formed by Ca2+-ions, Br- ions within the white ones.
6. CONNECTIONS TO EARLIER STUDIES OF THE ENERGY SURFACE OF COMPLEX SYSTEMS
In the field of solid state theory, there exists a long tradition of attempts 64 to understand the structure of solids, be they crystalline or amorphous, of macroscopic size or microscopic like clusters. It has always been clear that (meta)stable structures of solids at low temperatures can be identified by determining the minima of the energy hypersurface of the chemical system (as function of atomic coordinates), with the exception of certain systems dominated by quantum effects. Especially in the early days of solid state physics 65, 66, much effort was devoted to the calculation of cohesive (= ground state) energies of already known compounds 17, and the derivation of elastic constants by variation of the lattice parameters. With the development of computers and fast algorithms, new tools became available to solid state theory. From the molecular dynamics studies of Alder and Wainwright 67 modelling hard sphere liquids to the ab-initio molecular dynamics of the Car-Parrinello algorithm 35, able to perform local optimisations of the structure of simple solids 68, runs a clear path of steady improvement.
This work has been paralleled by developments in the theory of molecules and their structure 69. It is by now standard procedure to employ some local optimisation routine during ab initio studies of molecules. A joining of these two strands of research has come about in the investigation of clusters 7~ which can often be viewed either as large molecules or as nanoscopic solids. While the clusters studied so far are often too small to really justify being viewed as extended solids, the work on such mesoscopic systems has had a profound effect on the way one approaches the analogous problem of structure prediction in solids as discussed in this chapter.
For two facts common to clusters and solids have been brought home very strongly 73, 84-86 : the number of possible structures of a cluster/solid characterized by being local minima of the energy hypersurface is very large, and it is not clear at all how to decide on the "best" or "typical" one without studying the energy landscape in great detail. This involves the calculation of the depth of the local minima, the study of the density of states associated with these minima, and finally the determination of the barrier heights of the saddle points connecting the different structures a24 . While the importance of such saddle points has always been appreciated in the study of molecular reactions 87-91, this has been perhaps less so when dealing with solids, where usually only one structure has been considered at a time. The justification for this traditional procedure lies in the kinetic stability of solids, i.e., the thermodynamic behaviour of a solid may depend only on the small region of the energy landscape around some (meta)stable structure (cf. section 2). It has therefore been perfectly reasonable to concentrate on the local environment of some already known or expected interesting structure, and to perform local optimisations of some energy function. Such local optimisations 35, 68, 92, 93 have been performed with varying degrees of restrictions on the number of atoms, symmetry of the structure, and size of the cell parameters a25 . But if one tries to predict hypothetical compounds without any a-priori information, the whole energy landscape needs to be explored, requiring the use of both global optimisation algorithms and methods like the threshold algorithm.
An interesting application of global optimisation methods to the structure determination of solids from powder diffraction data - as opposed to the a-priori structure prediction described in this chapter - has been introduced in recent years by Pannetier, Newsam, Freeman, Catlow and others 31, 96-101. In their work, it is generally assumed that some experimental information about the compound is already available, e.g. as an X-ray powder diffractogram. Thus, the size and shape of the unit cell, the composition of the compound, the ionic charges, and the number of formula units present in the cell are assumed to be known. It "only" remains to determine the exact positions of the atoms within this cell, a task that often tends to be quite involved, especially if only data from X-ray or neutron powder diffraction experiments are available. Imposing the above constraints, the authors perform global or local minimisations of special cost functions that combine some energy terms based on effective potentials with penalty terms reflecting e.g. Pauling's bond-valence rule 98 or the requirement of an even distribution of ions within the unit cell 97. If the penalty terms and the effective potential are chosen appropriately a26 , the minimisation produces a number of reasonable structure candidates, which can then serve as input either for a structure solving program or for a refinement optimisation using a more realistic energy function. In the extreme case, the cost function is solely constructed from penalty terms and equals a "figure of merit" that reflects a-priori knowledge of the typical bonding arrangements in the solid 1~
a24 Thus we would expect that e.g. the lid/threshold algorithms will also prove useful in the study of the energy landscape of clusters.
a25 In this context, one should also mention the modelling of proteins 94 and of amorphous structures 56, 95. In the latter case, one aims for a disordered arrangement of the atoms: thus instead of finding a good sub-optimum using some global optimisation method, one performs a local minimisation resulting in an amorphous structure.
a26 Since these penalty terms do not correspond to a physical energy term, but only reflect some "intuitive" chemical or physical knowledge about the system, they have to be treated very carefully when being assigned a quantitative meaning compared to terms from an effective potential.

REFERENCES
1 E. J. Corey, Angew. Chem., 103, 469, (1991)
2 I. Ugi, J. Bauer, K. Bley, A. Dengler, A. Dietz, E. Fontain, B. Gruber, R. Herges, M. Knauer, K. Reitsam and N. Stein, Angew. Chem., 105, 210, (1993)
3 G. Ciccotti, D. Frenkel and I. R. McDonald, Simulation of Liquids and Solids, (North-Holland, Amsterdam, 1987)
4 V. M. Goldschmidt, Skrift. Nors. Videns.-Akad. Oslo, I (Mat.-Naturv. Kl.), (1926)
5 A. R. West, Solid State Chemistry and Its Applications, (Wiley & Sons, New York, 1984)
6 L. Pauling, The Nature of the Chemical Bond, (Cornell Univ. Press, Ithaca, 1960)
7 W. Hume-Rothery, J. Inst. Metals, 35, 295, (1926)
8 U. Müller, Anorganische Strukturchemie, (Teubner, Stuttgart, 1992)
9 I. D. Brown and R. D. Shannon, Acta Cryst. A, 29, 266, (1973)
10 R. Hoppe, Adv. Fluor. Chem., 6, 387, (1970)
11 N. Engel, Acta Cryst. B, 47, 849, (1991)
12 I. D. Brown and R. Duhlev, J. Solid State Chem., 95, 51, (1991)
13 R. Duhlev, I. D. Brown and C. Balarew, J. Solid State Chem., 95, 39, (1991)
14 I. D. Brown, Z. Krist., 185, 503, (1988)
15 I. D. Brown, Acta Cryst. B, 48, 553, (1992)
16 J. Callaway, Quantum Theory of the Solid State, (Academic Press, New York, 1974)
17 N. W. Ashcroft and N. D. Mermin, Solid State Physics, (Harcourt Brace College, New York, 1976)
18 J. K. Burdett and S. Lee, J. Am. Chem. Soc., 105, 1079, (1983)
19 E. Canadell and M.-H. Whangbo, Chem. Rev., 91, 965, (1991)
20 P. Sibani, J. C. Schön, P. Salamon and J.-O. Andersson, Europhys. Lett., 22, 479-485, (1993)
21 J. C. Schön, H. Putz and M. Jansen, J. Phys. Cond. Matter, 8, 143, (1996)
22 L. D. Landau and E. M. Lifshitz, Statistical Physics 3rd ed. Part 1, (Pergamon Press, New York, 1985)
23 R. G. Palmer, Adv. Phys., 31, 669, (1982)
24 P. Kollman, Chem. Rev., 98, 2395, (1993)
25 J. E. Hunter III, W. P. Reinhardt and T. F. Davis, J. Chem. Phys., 99, 6856, (1993)
26 J. C. Schön, J. Chem. Phys., 105, 10072, (1996)
27 J. C. Schön, Ber. Bunsenges. Phys. Chem., 100, 1388, (1996)
28 S. M. Foiles, M. I. Baskes and M. S. Daw, Phys. Rev. B, 33, 7983, (1986)
29 A. M. Stoneham, Handbook of Interatomic Potentials. I. Ionic Crystals, (preprint, 1981)
30 M. P. Tosi and F. G. Fumi, J. Phys. Chem. Sol., 25, 45, (1964)
31 C. R. A. Catlow, R. G. Bell and J. D. Gale, J. Mat. Chem., 4, 781, (1994)
32 M. Finnis, Acta Met. Mater., 40, S25, (1992)
33 N. Chetty, K. Stokbro, K. W. Jacobsen and J. K. Nørskov, Phys. Rev. B, 46, 3798, (1992)
34 C. Pisani, R. Dovesi and C. Roetti, Hartree-Fock ab-initio treatment of crystalline systems, (Springer Verlag, Heidelberg, 1988)
35 R. Car and M. Parrinello, Phys. Rev. Lett., 55, 2471, (1985)
36 P. J. M. van Laarhoven and E. H. L. Aarts, Simulated Annealing, (D. Reidel Publishing Company, Dordrecht, Holland, 1987)
37 S. Kirkpatrick, C. D. Gelatt, Jr. and M. P. Vecchi, Science, 220, 671, (1983)
38 V. Cerny, J. Opt. Theory Appl., 45, 41, (1985)
39 N. Metropolis, A. Rosenbluth, M. Rosenbluth, A. Teller and E. Teller, J. Chem. Phys, 21, 1087, (1953)
40 S. Geman and D. Geman, IEEE, PAMI, 6, 721, (1984)
41 P. J. M. van Laarhoven, De Erasmus Universiteit Rotterdam, Ph.D. Thesis, (1988)
42 B. Andresen, K. H. Hoffmann, K. Mosegaard, J. Nulton, J. M. Pedersen and P. Salamon, Journal de Physique (France), 49, 1485, (1988)
43 L. Goldstein, Mean Square Rates of Convergence in the Continuous Time Simulated Annealing Algorithm on R^d, (preprint, 1985)
44 B. Hajek, Math. Oper. R., 13, 311, (1988)
45 G. Ruppeiner, J. M. Pedersen and P. Salamon, J. Physique I, 1, 455, (1991)
46 P. Salamon, J. Nulton, J. Robinson, J. Pedersen, G. Ruppeiner and L. Liao, Comput. Phys. Commun., 49, 423, (1988)
47 D. A. Pierre, Optimization Theory with Applications, (Dover Publications, New York, 1986)
48 J. C. Schön and M. Jansen, Ber. Bunsenges., 98, 1541, (1994)
49 H. Putz, J. C. Schön and M. Jansen, Ber. Bunsenges., 99, 1148, (1995)
50 J. C. Schön and M. Jansen, Comp. Mat. Sci., 4, 43, (1995)
51 J. C. Schön and M. Jansen, Angew. Chem. (Int. Ed.), 108 (35), 1358 (1286-1304), (1996)
52 J. C. Schön, GIT Fachz. Lab., in press, (1996)
53 M. Wevers, Univ. Bonn, Diplom Thesis, (1995)
54 H. Putz, J. C. Schön and M. Jansen, in prep., (1997)
55 S. W. de Leeuw, J. W. Perram and E. R. Smith, Proc. Roy. Soc. London, Ser. A, 373, 27, (1980)
56 F. H. Stillinger and T. A. Weber, J. Phys. Chem., 91, 4899, (1987)
57 C. Oligschleger and H. R. Schober, Physica A, 201, 391, (1993)
58 J.-R. Hill and J. Sauer, J. Phys. Chem., 98, 1238, (1994)
59 T. S. Bush, J. D. Gale, C. R. A. Catlow and P. D. Battle, J. Mat. Chem., 4, 831, (1994)
60 J. D. Martin and J. D. Corbett, Angew. Chem., 107, 234, (1995)
61 W. F. Sherman and A. A. Stadtmuller, Experimental Techniques in High-Pressure Research, (Wiley & Sons, New York, 1987), 308
62 I. Barin, Thermochemical Data of Pure Substances, (VCH, New York, 1989)
63 M. Troemel, Z. anorg. allg. Chemie, 371, 237, (1969)
64 M. L. Cohen, Physica Scripta, T1, 5, (1982)
65 M. P. Tosi, in Solid State Physics vol 16, edited by F. Seitz and D. Turnbull (Academic Press, New York, 1964)
66 J. E. Jones and A. E. Ingham, Proc. Roy. Soc. A, 107, 636, (1925)
67 B. J. Alder and T. E. Wainwright, J. Chem.
Phys., 27, 1208, (1957) 68R. M. Wentzcovitch and J. L. Martins, Sol. Star. Comm., 78, 831, (1991) 69E. Clementi (Ed.), Modern Techniques in Computational chemistry: MOTECC-90, (ESCOM, Leiden, 1992) 7~ Ochsenfeld and R. Ahlrichs, Ber. Bunsenges., 98, 34, (1994) 71C. Ochsenfeld and R. Ahlrichs, J. Chem. Phys., 97, 3487, (1992) 72R. S. Berry, P. A. Braier, R. J. Hinde and H.-P. Cheng, Israel J. Chem., 30, 39, (1990)
127 73R. S. Berry, Chem. Rev., 93, 2379, (1993) 74V. Bonacic-Koutecky, P. Fantucci and J. Koutecky, Chem. Rev., 91, 1035, (1991) 75W. Andreoni, in The Chemical Physics of Atomic and Molecular Clusters, edited by G. Scoles (North-Holland, Amsterdam, 1990), 159 76E. Blaisten-Barojas and D. Levesque, Phys. Rev. B, 34, 3910, (1986) 77A. W. Castleman, Jr. and R. G. Keesee, Chem. Rev., 86, 589, (1986) 78W. Damgaard Kristensen, E. J. Jensen and R. M. J. Cotterill, J. Chem Phys., 60, 4161, (1974) 79U. Even, N. Ben-Horin and J. Jortner, Phys. Rev. Lett., 62, 140, (1989) 8~ R. Hoare, Adv. Chem. Phys., 40, 49, (1979) SiR. O. Jones and G. Seifert, J. Chem. Phys., 96, 7564, (1992) 82j. p. Rose and R. S. Berry, J. Chem. Phys., 98, 3246, (1993) 83D. J. Wales, Mol. Phys., 78, 151, (1993) 84R. S. Berry, J. Phys. Chem., 98, 6910, (1994) 85R. E. Kunz and R. S. Berry, Phys. Rev. Lett., 71, 3987, (1993) 86R. E. Kunz and R. S. Berry, Phys. Rev. E, 49, 1895-, (1994) 87B. Jeziorski, R. Moszynski and K. Szalewicz, Chem Rev., 94, 1887, (1994) 88M. D. Newton, Chem. Rev., 91,767, (1991) 89j. E. Eksterowicz and K. N. Houk, Chem. Rev., 93, 2439, (1993) 9~ Page and J. W. McIver, Jr., J. Chem. Phys., 88, 922, (1988) 91j. Pancik, Coll. Czech. Chem. Comm., 40, 1112, (1975) 92C. R. A. Catlow and A. N. Cormack, Int. Rev. Phys. Chem., 6, 227, (1987) 93M. P. Teter, Int. J. Quant. Chem.: Quant. Chem Symp., 27, 155, (1993) 94M. Wagener and J. Gasteiger, Angew. Chem., 106, 1245, (1994) 95C. Oligschleger, RWTH Aachen, Doct. Diss.. Thesis, (1994) %C. M. Freeman and C. R. A. Catlow, J. Chem. Soc., Chem. Comm. 1992, 89, (1992) 97C. M. Freeman, J. M. Newsam, S. M. Levine and C. R. A. Catlow, J. Mater. Chem., 3, 531, (1993) 98j. Pannetier, J. Bassas-Alsina, J. Rodriguez-Carvajal and V. Caignaert, Nature, 346, 343, (1990) 99p. A. Wright, S. Natarajan, J. M. Thomas, R. G. Bell, P. L. Gai-Boyes, R. H. Jones and J. Chen, Angew. Chem. Int. Ed. Engl., 31, 1472, (1992) I~176K. Belashenko, Inorg. 
Mat., 30, 966, (1994) I~ S. Bush, C. R. A. Catlow and P. D. Battle, J. Mater. Chem., 5, 1269, (1995) l~ W. Deem and J. M. Newsam, J. Am. Chem. Soc., 114, 7189, (1992) l~ W. Deem and J. M. Newsam, Nature, 342, 260, (1989)
Z.B. Maksić and W.J. Orville-Thomas (Editors)
Pauling's Legacy: Modern Modelling of the Chemical Bond
Theoretical and Computational Chemistry, Vol. 6
© 1999 Elsevier Science B.V. All rights reserved.
Polarizability and Hyperpolarizability of Atoms and Ions

David M. Bishop
Department of Chemistry, University of Ottawa, Ottawa, Canada K1N 6N5
In 1927 Pauling published his ground-breaking paper: "The Theoretical Prediction of the Physical Properties of Many-Electron Atoms and Ions: Mole Refraction, Diamagnetic Susceptibility, and Extension in Space." The historical setting of this work is recounted and the flavour of the early days of the 'new' quantum mechanics is recalled. Pauling's key ideas concerning the calculation of mole refraction, a quantity which is directly related to a species' polarizability, are analyzed. More recent developments in the theory of polarizabilities are examined and their extensions to higher-order polarizabilities, both static and frequency-dependent (dynamic), are given. Particular attention is paid to the hydrogen atom and the helium, neon, and argon isoelectronic series, all of which were treated in Pauling's original paper.
1. Historic Times

Linus Pauling arrived in Munich on April 20, 1926[1]. This was very much a case of being in the right place at the right time. He was going to spend one year at the Institute of Theoretical Physics, which at the time was directed by Arnold Sommerfeld. For this purpose he had been awarded one of the very first fellowships of the John Simon Guggenheim Memorial Foundation. It was the right place because Sommerfeld was a bridge between the 'old' and the 'new' quantum mechanics: he had spent much time trying, with only partial success, to patch up Bohr's old quantum theory. His Institute was one of the nerve centres in Germany for the development of the revolutionary ideas of Heisenberg (a former student of Sommerfeld) and Schrödinger (whose seminal paper was published in 1926). Prepublication documents circulated throughout the Institute and, since things were moving so rapidly, this was of great importance. Sommerfeld did not contribute much to the new theory but was a strong supporter and a guiding influence on many young scientists at the time; Heisenberg once said "I learned mathematics from Born and physics from Bohr. And from Sommerfeld I learned optimism." Max Born in Göttingen and Niels Bohr in Copenhagen directed two of the other 'nerve centres' of the epoch.

How Pauling came to be in Munich in 1926 is an interesting story not without its element of skullduggery. He had left the California Institute of Technology, better known as Cal Tech, where he had received his Ph.D. summa cum laude in June of 1925. When he had begun his Ph.D. in the fall of 1922, Cal Tech, as such, was only four years old; it had previously been a small manual training college called Throop College. With a lot of money and a lot of talent, it was already on its way to becoming a major
scientific institution. One of the talents was the Chairman of the Chemistry Department, Arthur A. Noyes; another was the Chairman of the Physics Department, Robert A. Millikan. It was Noyes who assigned Pauling to Roscoe G. Dickinson to carry out x-ray crystallographic studies for his thesis. Dickinson, himself, was the first person to receive a Ph.D. in chemistry from Cal Tech. As well as doing this, Pauling also worked for a while with Peter Debye, who was a visiting professor. In fact, Cal Tech was no backwater; other visitors, while Pauling was there, were Born, Sommerfeld, Ehrenfest, Franck, Bohr, and Einstein. On top of this, the faculty boasted the two renowned theoreticians Carl Eckart and Paul S. Epstein, whose name will re-occur shortly, as well as the great physical chemist Richard C. Tolman. It was Tolman who was to introduce Pauling to quantum theory. Had Pauling not gone to Europe he still would have been in a young and thriving intellectual environment.

The reason he did go was the doing of Noyes. Pauling had shown early brilliance and was now what Variety magazine would call a hot property. He had been given a National Research Council grant to do post-doctoral studies at Berkeley, where Gilbert N. Lewis, another giant of the day, was the head of the chemistry department. Noyes was determined not to lose Pauling to Lewis and persuaded Pauling to stay on for a few more months, which Pauling did. Noyes used this breathing space to good effect. He practically 'arranged' that one of the first Guggenheim Fellowships, given for study abroad, should go to Pauling. There was only one snag: the Guggenheim would not be approved until April 1926; consequently, Noyes offered Pauling Cal Tech funds for his travel and living expenses until that time came. Noyes was smart enough to realize that this was an offer Pauling could not refuse. This was the time when great things were happening, and happening fast, in Europe.
Pauling did not refuse; he relinquished his NRC scholarship and set out for Germany. For Noyes, Europe was a safe distance from Berkeley and his scheming paid off: in the fall of 1927 Pauling returned to Cal Tech and stayed there for thirty more years.

One of the first things Pauling did in Munich was to discover a mathematical error in a paper by Sommerfeld's assistant Gregor Wentzel. Pauling wrote this up for Zeitschrift für Physik[2]. Sommerfeld was sufficiently impressed to say that he would submit Pauling's next paper to the Proceedings of the Royal Society (London). Sommerfeld had just recently been appointed a foreign member of this most prestigious body and articles in their Proceedings could only be submitted by a member. Pauling's 'next paper' was the one which is the focus of this chapter[3]. It was received in London on New Year's Day 1927. In telling this story it is apparent how much was new, fresh and young in 1926: Pauling, Cal Tech, Roscoe, the Guggenheim Fellowships, quantum theory, even Arnold Sommerfeld as far as the Royal Society was concerned!
2. The Paper

"The Theoretical Prediction of the Physical Properties of Many-Electron Atoms and Ions: Mole Refraction, Diamagnetic Susceptibility, and Extension in Space" has all the Pauling hallmarks: quick off the mark, broad-sweeping in range, intuitive approximations, and so nearly right. We will be concerned only with the 'mole refraction' part of his article. The mole refraction ($R$) is related to the refractive index ($n$) by

$$R = \frac{n^2 - 1}{n^2 + 2}\,V_m \tag{1}$$

where $V_m$ is the molar volume. Since

$$\frac{n^2 - 1}{n^2 + 2} = \frac{\rho_0\,\alpha(\omega)}{3\varepsilon_0} \tag{2}$$

where $\rho_0$ is the number density, $\varepsilon_0$ is the permittivity of vacuum and $\alpha(\omega)$ is the dynamic dipole polarizability for light of frequency $\omega$, we can write

$$R = \frac{4\pi N_A}{3}\,\frac{\alpha(\omega)}{4\pi\varepsilon_0} \tag{3}$$

if the SI convention is used. There is thus a direct proportionality between $R$ and the polarizability $\alpha$; usually the frequency dependence is small and $\alpha(\omega)$ is assumed to be the same as the static quantity, $\alpha$, defined for $\omega \to 0$. For this reason a discussion of the mole refraction part of Pauling's paper is equally a discussion of the dipole polarizabilities of atoms and atomic ions. The dipole polarizability also expresses the second-order effect of an electric field on the energy levels of an atom or molecule. We can write, for an atom (which has no dipole moment and therefore, in general, no first-order effect):

$$E = E^0 - \tfrac{1}{2}\alpha F^2 - \dots \tag{4}$$
where subscripts identifying the energy level are omitted. Since the Stark effect describes the phenomenon of shifts in spectral lines due to an electric field, it is apparent that it is governed by the polarizabilities of the species under investigation. The correct prediction of the Stark effect for the hydrogen atom (or of its polarizability) was one of the earliest triumphs of the new quantum mechanics. Prior to 1926, there had been a problem with the H atom polarizability formula as given by Epstein[4]: it gave the wrong results! The formula was based on Sommerfeld's extension of Bohr theory and it was Sommerfeld, himself, who recognized[5] that it was incompatible with the observations made by Takamine and Kokubu[6] in 1919 at the Mount Wilson Observatory. The shift of the whole splitting pattern of H to the red was confirmed but not its magnitude. It is not surprising, then, that when the galley sheets of Schrödinger's first paper[7] started circulating, there would be a race to see if his new theory could clean up this embarrassing problem; after all, this was only the H atom! The finish was almost a dead heat between Gregor Wentzel[8] in Munich (received date: 18.6.1926), Ivar Waller[9] in Copenhagen (received date: 21.6.1926), and Paul
Epstein[10] in Pasadena (submitted dates: 24.7.1926 and 29.7.1926). A sense of the drama is given by the opening paragraph of Epstein's letter to Nature: "The theory of atomic oscillations recently advanced by Schroedinger is of extraordinary importance since it throws a new light on the problems of atomic structure and, at the same time, offers a convenient practical method for calculating the Heisenberg-Born intensity matrices. It seemed desirable to apply it to as many special cases as possible. A complete theory of the Stark effect in hydrogen was, therefore, developed."
In all three cases, the essential second-order energy shift for a field strength $F$ was given by

$$\Delta E = -\frac{n^4}{16Z^4}\left(17n^2 - 3m^2 - 9n_3^2 + 19\right)\left(4\pi\varepsilon_0 a_0^3\right)F^2 \tag{5}$$
Here Pauling's notation has been used for the quantum numbers, namely

$$m = n_2 - n_1, \qquad n_3 = n - 1 - n_1 - n_2$$

where $n$ is the principal quantum number; the notation for $n_3$ differs by one from that of Wentzel. This formula differed from Epstein's of 1916 in the definition of $n_3$ and the appearance of the final 19. The formula applies to all H-like atoms and $Z$ is the nuclear charge; hence the energy shift scales as $Z^{-4}$ for the mono-electronic series. Using the SI convention, this formula determines that $\alpha$ for H-like atoms should be

$$\alpha = \frac{n^4}{8Z^4}\left(17n^2 - 3m^2 - 9n_3^2 + 19\right)\left(4\pi\varepsilon_0 a_0^3\right) \tag{6}$$

or, in atomic units (au),

$$\alpha = \frac{n^4}{8Z^4}\left(17n^2 - 3m^2 - 9n_3^2 + 19\right) \tag{7}$$
For the ground state of the H atom, $n = 1$ and $m = n_3 = 0$, so $\alpha = 9/2$ au. In his own paper Pauling noted that giving equal weight to quantum numbers $m$ and $n_3$, and averaging, $\overline{m^2} = \overline{n_3^2} = (n^2 - 1)/6$, leads to

$$\alpha = \frac{n^4}{8Z^4}\left(15n^2 + 21\right) \tag{8}$$

Needless to say, the $\Delta E$ formula, Eq. (5), was in complete agreement with spectroscopic observations and was considered a triumph for the new quantum mechanics. Today, the value of $\alpha = 4.5$ au for the ground state of H would be more simply derived by writing the first-order perturbed wavefunction as $\psi_0 + \psi'$ and, using the method of Coulson[11], we would find $\psi' = -\tfrac{1}{2}(r^2 + 2r)\cos\theta\,\psi_0$, yielding $\alpha = -2\langle\psi_0|z|\psi'\rangle = 9/2$ au.
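As a quick check on Eqs. (7) and (8), the short Python sketch below evaluates the hydrogen-like polarizability for given quantum numbers and compares it with Pauling's averaged form. The function names are illustrative choices made here, not notation from Pauling's paper, and exact rational arithmetic is used only for convenience.

```python
from fractions import Fraction


def alpha_hydrogenic(n, m, n3, Z=1):
    """Eq. (7): polarizability (au) of a hydrogen-like state (n, m, n3)."""
    return Fraction(n**4, 8 * Z**4) * (17 * n**2 - 3 * m**2 - 9 * n3**2 + 19)


def alpha_averaged(n, Z=1):
    """Eq. (8): Eq. (7) with m^2 and n3^2 both replaced by (n^2 - 1)/6."""
    return Fraction(n**4, 8 * Z**4) * (15 * n**2 + 21)


# Ground state of H: n = 1, m = n3 = 0
print(alpha_hydrogenic(1, 0, 0))        # 9/2
# For n = 1 the averaged and unaveraged forms coincide
print(alpha_averaged(1))                # 9/2
# Z^-4 scaling along the one-electron series (Z = 2)
print(alpha_hydrogenic(1, 0, 0, Z=2))   # 9/32
```

Substituting $(n^2 - 1)/6$ for both $m^2$ and $n_3^2$ in Eq. (7) turns $17n^2 - 3m^2 - 9n_3^2 + 19$ into $15n^2 + 21$, which is why the two functions agree for every $n$ once the average is taken.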
Pauling's initiative in 1926 was to devise a way to use Eq. (8) for any atom or ion and sweep through the Periodic Table; as a chemist he would never be content with just the H atom, no matter how important it was for the justification of the new quantum mechanics, and as a crystallographer he was always interested in ions such as Na⁺, F⁻, Cl⁻. Realizing that effective nuclear charges had been useful in rationalizing the spectroscopic term values for many-electron atoms and ions, he applied the same concept here. That is, he took Eq. (8), put in a sum over all valence electrons (which he called penetrating orbits) and replaced $Z$ by $Z - S$, where $S$ is a screening constant. He then used a mixture of old and new quantum mechanics to determine values of $S$ for each sub-shell of the outermost electronic shell. However, he did not use these values directly but, rather, empiricized them, so that the experimental values of the molar refraction of the rare gases, for example, were reproduced. For the He isoelectronic series, where there is only one screening constant, this is found to be, with today's experimental and theoretical data, $S = 0.403$, since $\alpha$(He) = 1.383 au. His purely theoretically determined value was 0.391, which is remarkably close. Incidentally, this leads to $\alpha$(He) = 1.343 au and could stand as the first post-Schrödinger calculation of $\alpha$(He). However, the values of mole refraction which he reported for the rare gas isoelectronic series were all anchored to the refractive index measurements of the time. To this extent they are semi-empirical and in Table 1 they are given as $\alpha$ values in atomic units. The conversion factor which was applied to Pauling's table was 4.5/1.691, that is $\alpha$(H) divided by his reported (exact) mole refraction. Though, in this chapter, we will concentrate on the He, Ne, Ar isoelectronic series, Pauling did cover the Cu⁺, Kr, Ag⁺, Xe, and Au⁺ isoelectronic series as well.
For these series and the argon series, he applied a second empirical correction which allowed the screening constant to change with $Z$; the parameters in the functional form of this change were chosen so that the refractive indices of the Cl⁻ and Br⁻ ions in solution were duplicated.

Table 1. Pauling's values of polarizabilities for the first three isoelectronic series, converted to atomic units. Values in parentheses are assumed empirical values.
He series           Ne series           Ar series
Species  α (au)     Species  α (au)     Species  α (au)
H⁻       68.26      C⁴⁻      14370      Si⁴⁻     2528
He       (1.37)     N³⁻      193        P³⁻      279
Li⁺      0.20       O²⁻      26.3       S²⁻      69.2
Be²⁺     0.053      F⁻       7.05       Cl⁻      24.7
B³⁺      0.020      Ne       (2.65)     Ar       (11.0)
C⁴⁺      0.0090     Na⁺      1.216      K⁺       5.64
N⁵⁺      0.0048     Mg²⁺     0.633      Ca²⁺     3.17
                    Al³⁺     0.362      Sc³⁺     1.94
                    Si⁴⁺     0.224      Ti⁴⁺     1.25
                    P⁵⁺      0.144      V⁵⁺      0.82
                    S⁶⁺      0.096      Cr⁶⁺     0.59
                    Cl⁷⁺     0.067      Mn⁷⁺     0.43
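The conversion between mole refraction and polarizability in atomic units follows directly from Eq. (3). The sketch below is a minimal illustration that assumes the mole refraction is expressed in cm³ per mole (the text does not state the unit) and uses present-day values of the Avogadro constant and the Bohr radius, so it will not reproduce exactly the factor 4.5/1.691 that rests on Pauling's reported number. The function names are hypothetical.

```python
import math

N_A = 6.02214076e23           # Avogadro constant, 1/mol (modern value, assumed)
BOHR_CM = 0.529177210903e-8   # Bohr radius in cm (modern value, assumed)


def mole_refraction_from_alpha(alpha_au):
    """Eq. (3): R in cm^3/mol from alpha in atomic units (units of a0^3)."""
    alpha_volume = alpha_au * BOHR_CM**3   # alpha / (4 pi eps0), in cm^3
    return 4.0 * math.pi * N_A * alpha_volume / 3.0


def alpha_from_mole_refraction(r_cm3_per_mol):
    """Inverse of Eq. (3): alpha in atomic units from R in cm^3/mol."""
    return 3.0 * r_cm3_per_mol / (4.0 * math.pi * N_A * BOHR_CM**3)


R_H = mole_refraction_from_alpha(4.5)
print(f"R(H) with modern constants: {R_H:.3f} cm^3/mol")
print(f"factor used for Table 1 (4.5/1.691): {4.5 / 1.691:.4f}")
print(f"factor from modern constants:        {alpha_from_mole_refraction(1.0):.4f}")
```

The two factors differ by well under one percent, which is smaller than the scatter between Pauling's values and modern ones for most entries in Table 1.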
Pauling was quite aware of the lack of rigour and the crudity of some of his approximations; for example, using hydrogenic orbitals to define the screening constants
and only treating the valence electrons. As well, his scheme depended on $Z_i/Z \ll 1$, where $Z_i$ is the electronic charge in the ith shell. Nonetheless, he got rough numbers for a wide range of atoms and ions before anybody else did. As we will see, the scaling procedure does stand up quite well for the He isoelectronic series when comparison is made with the best $\alpha$ values we have today.

Before going on to survey more recent calculations of $\alpha$, it is appropriate to recognize the contribution Pauling made to this discipline in the textbook[12] which he co-authored with E. Bright Wilson, Jr., published in 1935. It is still in print in its original edition. Though the preface would not now be considered politically correct, this treatise remains a mine of information. It is particularly relevant in the attention it pays to the more accurate and non-empirical calculations which were made of $\alpha$ during the period 1927-35.
3. Survey of Polarizability and Hyperpolarizability Calculations

It is now seventy years since Pauling published his paper on the polarizability of atoms and atomic ions. In the interim, literally hundreds of articles have been published in this area and it would be impossible to consider all of them; necessarily, then, this survey will be selective. To bring us up to date it will be important to also include higher order polarizabilities (hyperpolarizabilities) and their dynamic counterparts: the frequency dependent polarizabilities and hyperpolarizabilities. The latter are properties which govern nonlinear optical processes and, as such, play a large part in the design of materials for electro-optical switching and all-optical communications. To keep this survey within reasonable bounds only the hydrogen atom and the three specific isoelectronic series He, Ne, and Ar will be considered. Furthermore, only ab initio calculations, that is to say completely theoretical calculations without recourse to empirical parameters, will be discussed.

In principle any quantum mechanical method which can be used for determining atomic energy levels (variational and perturbation procedures) can be readily adapted to provide the static, or non-dynamic, polarizabilities and hyperpolarizabilities; it is simply a matter of adding the quantum mechanical operator which corresponds to the applied electric field to the Hamiltonian. The energy shift which results may then be expressed as Eq. (4) and $\alpha$ (or, in general, higher-order polarizabilities) determined. This means that we can find $\alpha$, etc., at the various levels of approximation which are used for energy calculations, for example, Hartree-Fock (HF), many body perturbation theory (MBPT), such as Møller-Plesset second-order perturbation theory (MP2), density functional theory, and explicitly electron-correlated methods (based on Hylleraas-type wavefunctions).
In fact common quantum-chemical programs such as GAUSSIAN and HONDO routinely include the polarizability as an optional property which may be found. For the dynamic counterparts, as will be discussed, the situation is not quite so routine and, thereby, more interesting. There are a number of reviews in the literature which the interested reader will want to consult. Three older ones, which contain much fundamental information that is still relevant, are by Dalgarno[13] on atomic polarizabilities, by Buckingham[14] on hyperpolarizabilities, and by Bogaard and Orr[15] on dynamic hyperpolarizabilities. More recently, Dykstra et al.[16], and Hasanein[17] have written reviews on the subject for
Advances in Chemical Physics as have Bishop[18] and Luo et al.[19] for Advances in Quantum Chemistry. Two reviews which cover both theoretical and experimental developments have been written by Bonin and Kadar-Kallen[20] for polarizabilities and by Shelton and Rice[21] for hyperpolarizabilities. Finally, though it is not a review as such, attention is drawn to the paper by Stiehler and Hinze[22] on the calculation of static polarizabilities and hyperpolarizabilities for the atoms He through Kr. This paper is so thorough in its reference to other calculations on these atoms that it deserves to be specifically singled out as an excellent source for theoretical data.

We will divide the survey into three parts: (3.1) static dipole polarizabilities, (3.2) static dipole hyperpolarizabilities, and (3.3) dynamic dipole polarizabilities and hyperpolarizabilities. Within each part there will be sub-sections dealing with the three isoelectronic series He, Ne, and Ar. For (3.2) and (3.3) the hydrogen atom will also be included.
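The recipe mentioned above (add the field operator to the Hamiltonian, compute the energy, and read the polarizability off Eq. (4)) can be sketched for the hydrogen atom. The three-function basis exp(-r), z exp(-r), z r exp(-r) and its analytic matrix elements are an illustrative assumption made here, not a method taken from this chapter. This basis happens to contain the exact first-order wavefunction, so the polarizability comes out close to 4.5 au, but it is too small to give a meaningful fourth-order coefficient.

```python
# Finite-field estimate of alpha for the hydrogen atom (atomic units).
# Assumed basis (illustrative, not from the chapter):
#   phi0 = exp(-r),  phi1 = z*exp(-r),  phi2 = z*r*exp(-r)
# All matrix elements below are analytic integrals divided by pi.
E0 = -0.5                               # field-free ground-state energy
S_PP = ((1.0, 2.5), (2.5, 7.5))         # overlaps among phi1, phi2
H_PP = ((0.0, -0.25), (-0.25, -0.75))   # field-free Hamiltonian among phi1, phi2
B = (1.0, 2.5)                          # <phi0|z|phi1>, <phi0|z|phi2>


def coupling(energy):
    """Return b^T (H_PP - energy*S_PP)^(-1) b via the explicit 2x2 inverse."""
    a11 = H_PP[0][0] - energy * S_PP[0][0]
    a12 = H_PP[0][1] - energy * S_PP[0][1]
    a22 = H_PP[1][1] - energy * S_PP[1][1]
    det = a11 * a22 - a12 * a12
    return (a22 * B[0] ** 2 - 2.0 * a12 * B[0] * B[1] + a11 * B[1] ** 2) / det


def ground_state_energy(field, sweeps=20):
    """Lowest eigenvalue of H0 + field*z in the three-function basis.

    The field couples phi0 only to the p-type functions, so the secular
    equation reduces to E = E0 - field**2 * coupling(E); for weak fields
    fixed-point iteration starting from E0 converges rapidly. The energy
    is even in the field, so the sign convention of the field term does
    not matter here.
    """
    energy = E0
    for _ in range(sweeps):
        energy = E0 - field ** 2 * coupling(energy)
    return energy


def alpha_finite_field(field=0.001):
    """Fit E(F) = E0 - alpha*F^2/2 (Eq. (4), truncated) at one field strength."""
    return -2.0 * (ground_state_energy(field) - E0) / field ** 2


print(f"alpha(H) from a finite field of 0.001 au: {alpha_finite_field():.4f} au")
```

The residual difference from 4.5 au is of order $F^2$ and comes from truncating Eq. (4) after the quadratic term; using several field strengths and fitting would remove it.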
3.1. Static dipole polarizabilities ($\alpha$)

In the presence of a uniform static electric field ($F$) the energy of an atom may be expressed by the Taylor series[14]

$$E_F = E^0 - \frac{1}{2}\alpha F^2 - \frac{1}{24}\gamma F^4 + \dots \tag{9}$$

and the induced dipole moment by

$$\mu_F = \alpha F + \frac{1}{6}\gamma F^3 + \dots \tag{10}$$
where $\alpha$ and $\gamma$ are the dipole polarizability and hyperpolarizability, respectively. In Eq. (9), on the grounds of symmetry, there are no terms in odd powers of $F$. $E^0$ is the unperturbed energy. If the perturbation operator due to the field, $-zF$ (where $z$ is the Cartesian coordinate), is added to the Hamiltonian then $E_F$ can be evaluated variationally, i.e., any parameters in the perturbed wavefunction ($\psi_F$) are chosen so that $E_F$ is a minimum. The form of $E_F = \langle\psi_F|H^0 - zF|\psi_F\rangle$ may be written in a number of ways and when evaluated can be expressed as Eq. (9), from which the properties $\alpha$ and $\gamma$ can be deduced. Because of the variational quality of $E_F$, parameters in $\psi_F$ will be chosen to maximize $\alpha$ and $\gamma$, once $E^0$ is minimal.

There are both numerical and analytical ways of carrying out this procedure. The first is the easiest to understand and was first applied to the Hartree-Fock (HF) method by Cohen and Roothaan[23]. One simply takes various values of $F$ (usually of the order of 0.001 au), finds the corresponding $E_F$ and makes a fit to Eq. (9). This is called the finite field method and it may be applied within the framework of any of the standard methods which determine energies, e.g., HF, MP2, MP4, coupled cluster (CC), MCSCF. Alternatively, analytical methods can be used such as coupled Hartree-Fock (CHF), where the perturbed HF equations are solved directly. Using standard perturbation theory one can also develop a sum-over-states (SOS) formalism and write
$$\alpha = 2\sum_{n}{}^{\prime}\,\frac{|\langle g|z|n\rangle|^2}{E_n - E_g} \tag{11}$$

where $|g\rangle$ and $|n\rangle$ are ground and excited electronic wavefunctions, respectively. The prime in Eq. (11) indicates omission of the ground state. If the wavefunctions and energies ($E_n$, $E_g$) are taken to be the Hartree-Fock ones, then the method is called uncoupled Hartree-Fock (UCHF). However, we may choose electron-correlated wavefunctions and we may find them as a pseudospectral series, as will be described later. This latter method has, in fact, achieved the greatest accuracy. Though numerical wavefunctions have been used for polarizability calculations, see Refs. [22] and [24], it is much more common to use analytical basis functions such as Gaussians. In this case, the proper optimization of the nonlinear parameters (exponents) in the context of the $\alpha$ and $\gamma$ computations is crucial. Two recent papers address this problem[25],[26].

3.1.1 The He isoelectronic series

First, let us consider the hydride anion, H⁻. This is a highly correlated system since electron correlation is the sole source of the binding of the second electron. A CHF treatment therefore gives a spurious result for $\alpha$ of 91.40 au[27]. Pauling's value, see Table 1, was 68.26 au. A more reliable value is obtained from Eq. (11) using explicitly correlated wavefunctions (ECW)[28]. Since this method will be frequently used for two-electron systems, we will give the relevant details. The wavefunctions used in Ref. [28] were a generalization of those of Thakkar and Smith[29],[30]:
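$$\Psi_L(r_1, r_2, r_{12}, \theta_1, \theta_2, \phi_1, \phi_2) = \sum_{l_1 l_2}\sum_{k=1}^{N_{l_1 l_2}} C_k^{l_1 l_2}\,(1 + P_{12})\,r_1^{l_1} r_2^{l_2}\,Y_{l_1 l_2}^{L M_L}(\hat{r}_1, \hat{r}_2)\,\exp\!\left(-\alpha_k^{l_1 l_2} r_1 - \beta_k^{l_1 l_2} r_2 - \gamma_k^{l_1 l_2} r_{12}\right) \tag{12}$$

where, in vector-coupling notation:

$$Y_{l_1 l_2}^{L M_L}(\hat{r}_1, \hat{r}_2) = \sum_{m_1, m_2}\langle l_1 m_1 l_2 m_2 | l_1 l_2 L M_L\rangle\,Y_{l_1}^{m_1}(\hat{r}_1)\,Y_{l_2}^{m_2}(\hat{r}_2) \tag{13}$$

and $r_1$, $\theta_1$, and $\phi_1$ are the coordinates of electron 1; $r_{12}$ is the interelectronic separation and $Y_l^m$ are the spherical harmonics. This form of the wavefunction allows for configurations of the type (pp) and these were incorporated into the ¹D states which were necessary in the $\gamma$ calculations (though not in the ¹S states, where they were insignificant). The nonlinear parameters were chosen by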
+
where $\langle\langle x\rangle\rangle$ denotes the fractional part of $x$. In optimizing the parallelotope parameters $A_1$, $A_2$, etc., the following constraints were maintained
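$$\alpha_{\min}^{l_1 l_2} = \min_j \alpha_j^{l_1 l_2} > 0 \tag{17}$$

$$\beta_{\min}^{l_1 l_2} = \min_j \beta_j^{l_1 l_2} > 0 \tag{18}$$

$$\min_j \gamma_j^{l_1 l_2} > -\min\!\left(\alpha_{\min}^{l_1 l_2}, \beta_{\min}^{l_1 l_2}\right) \tag{19}$$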
The optimization was based on minimizing $E^0$ and maximizing $\alpha$ and $\gamma$. The linear coefficients $C_k^{l_1 l_2}$ were chosen by diagonalizing the Hamiltonian matrix. The description just given applies to states required in $\gamma$ calculations as well as those for $\alpha$. By using some 100-odd nonlinear exponents, and therefore the same number of basis functions and wavefunctions (which with their corresponding energies form a pseudospectral series), it was possible to determine $\alpha$(H⁻) from the SOS formulation with great accuracy. The best value, 206.2 au, falls within the bounds determined by Glover and Weinhold[31]. It is also in excellent agreement with a value determined by Bhatia and Drachman[32], which uses an alternative basis set.

For the He atom there has been an extremely large number of calculations using all kinds of different methods. The most accurate value has, again, been found with the ECW-SOS method used for H⁻ and is $\alpha$ = 1.38319 au[33] (with the atomic unit based on a reduced electron mass appropriate for the He nucleus); the experimental value is 1.38320 ± 0.00007 au[34]. Other accurate values are to be found in Refs. [27], [31], and [35]. We recall that Pauling's value, using his theoretically determined screening constant, was 1.343 au.

The next member of the isoelectronic series is Li⁺ and here I would like to add a personal note. In 1984, together with Claude Pouchan[36], I made a finite field calculation (using a point charge rather than an electric field) to determine $\alpha$(Li), $\alpha$(Li⁺) and $\alpha$(Li⁻). We obtained the value $\alpha$(Li⁺) = 0.188 au. Following publication in Physical Review A, we received a letter (dated March 9, 1984) from the Linus Pauling Institute of Science and Medicine and signed by its eponym. The letter pointed out, in friendly terms, that he, Linus Pauling, had found the value 0.194 au in 1927 (actually, a more exact conversion of the original number would be 0.20 au).
What struck us was that, at the age of 83, Pauling was still keeping up with the literature and that he was, perhaps, gently chiding us that we had not referenced his earlier work. The most accurate value for Li⁺ is found, via the ECW-SOS method, to be 0.19245 au[37] and is very close to Pauling's corrected value (see Table 1). Similar values are to be found in Refs. [27], [32], and [38].

The whole isoelectronic series has been investigated by Chen[39], Harbola[40] (using the density functional approach), and by McEachran et al.[41] The ECW-SOS values[37] for the series (nuclear charge = $Z$) are: 1.3832 ($Z$=2), 0.1924 ($Z$=3), 0.0523 ($Z$=4), 0.0196 ($Z$=5), 0.0090 ($Z$=6), 0.0047 ($Z$=7), 0.0027 ($Z$=8), 0.0016 ($Z$=9), and 0.0010 au ($Z$=10). If the helium value is used to define the screening constant, i.e. $1.3832 = 9(Z - S)^{-4}$, then the rest of the series are quite accurately predicted: 0.198 ($Z$=3), 0.054 ($Z$=4), 0.020 ($Z$=5), 0.009 ($Z$=6), 0.005 ($Z$=7), etc.; the anomaly, of course, is H⁻, for which the prediction is 70.76 au.
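The screening-constant scaling just described is easy to reproduce. The sketch below assumes only the relation quoted in the text, $\alpha = 9(Z - S)^{-4}$ (two 1s electrons, each contributing 4.5 au from Eq. (8) with $n = 1$), fixes $S$ from the helium value, and compares the scaled predictions with the ECW-SOS numbers listed above. The function names are illustrative.

```python
def screening_constant(alpha_ref, z_ref=2):
    """Solve alpha_ref = 9 * (z_ref - S)**-4 for the screening constant S."""
    return z_ref - (9.0 / alpha_ref) ** 0.25


def alpha_scaled(z, s):
    """Two-electron polarizability (au) from the screened hydrogenic formula."""
    return 9.0 / (z - s) ** 4


S = screening_constant(1.3832)            # anchor on the accurate He value
ecw_sos = {3: 0.1924, 4: 0.0523, 5: 0.0196, 6: 0.0090, 7: 0.0047}

print(f"S = {S:.3f}")
for z, accurate in ecw_sos.items():
    print(f"Z = {z}: scaled {alpha_scaled(z, S):.3f} au, ECW-SOS {accurate:.4f} au")
print(f"H- (Z = 1): scaled {alpha_scaled(1, S):.1f} au, ECW-SOS 206.2 au")
```

With Pauling's purely theoretical $S = 0.391$, `alpha_scaled(2, 0.391)` gives the 1.343 au quoted earlier for helium. The failure for H⁻ reflects the fact that a single screened hydrogenic exponent cannot describe the loosely bound second electron.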
3.1.2 The Ne isoelectronic series

There have been many calculations of $\alpha$ for neon itself at various levels of approximation. Some of these are summarized in Ref. [26] and a basis set analysis has been made in Ref. [25]. Of the electron-correlated results[42]-[47], there appears to be a consensus that the best value using coupled-cluster singles-and-doubles wavefunctions with a perturbational estimate of connected triple excitations, CCSD(T), is 2.680-2.690 au[25],[44],[46],[47].

For the rest of the isoelectronic series, we will first look at F⁻, since it has been very well studied[48]-[52]. The reason for this interest is the large amount of electron correlation present. In fact, the $\alpha$ value practically doubles upon its introduction. The HF value is 10.67 au[25] (or 10.62 au[48]) and the CCSD value was estimated by Kucharski et al.[51] to be from 18 to 20 au; recently Woon and Dunning placed the CCSD(T) value at 17.15 au. Pauling's value of 7.05 au is of similar size to the HF value but necessarily far from the correlated value.

For the rest of the isoelectronic series there are far fewer results: the CHF ones of McEachran et al.[41], and the uncoupled HF ones of Yoshimine and Hurst[53]. The former are the more reliable. There appear to be no electron-correlated values. From Ref. [41], for example, $\alpha$(Na⁺) = 0.9454, $\alpha$(Mg²⁺) = 0.4701, $\alpha$(Al³⁺) = 0.2652 au. The corresponding Pauling values (see Table 1) are 1.216, 0.637 and 0.362 au. Density functional methods within the local density approximation have been applied to Na⁺ and Mg²⁺; they give $\alpha$(Na⁺) = 1.07 and $\alpha$(Mg²⁺) = 0.51 au (see values quoted in Ref. [54]). This is the closest we have so far come to accounting for correlation.
3.1.3 The Ar isoelectronic series

As we go further down the Periodic Table, the number of calculations becomes decidedly smaller. For argon itself, electron-correlated values are to be found in Refs. [25], [45], [47], [55], [56] and [58]. The best values, CCSD(T), range from 11.07 to 11.17 au. The isoelectronic series has been explored in Refs. [41] and [53]. In the former, through CHF calculations, we find $\alpha$(K⁺) = 5.461, $\alpha$(Ca²⁺) = 3.261, $\alpha$(Sc³⁺) = 2.137, $\alpha$(Ti⁴⁺) = 1.490 au. Pauling's values agree quite well; in the same order, they are 5.64, 3.17, 1.94 and 1.25 au. For the chloride ion[25],[52],[58]-[60], there is an MP4 result of 38.01 au and a CCSD(T) one of 37.43 au. The Pauling value, as for all anions, is an underestimate: 24.7 au. For the K⁺ and Ca²⁺ cations, we appear to have only the density functional values in Ref. [54]; these are 11.44 and 5.60 au, respectively. They are roughly twice the size of those in Table 1, but do agree with experimental results and it would seem that the Pauling approach is failing.
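To put a number on "agree quite well", the following sketch computes the percentage deviation of Pauling's values from the CHF values quoted above for the first four cations of the argon series. Only numbers given in the text are used; the helper name is an illustrative choice.

```python
# Values (au) quoted in the text for the argon isoelectronic cations
species = ["K+", "Ca2+", "Sc3+", "Ti4+"]
chf = [5.461, 3.261, 2.137, 1.490]     # coupled Hartree-Fock, Ref. [41]
pauling = [5.64, 3.17, 1.94, 1.25]     # Table 1


def percent_deviation(approx, reference):
    """Signed deviation of approx from reference, in percent."""
    return 100.0 * (approx - reference) / reference


deviations = [percent_deviation(p, c) for p, c in zip(pauling, chf)]
for name, dev in zip(species, deviations):
    print(f"{name:5s} {dev:+6.1f} %")
```

The deviation is a few percent for K⁺ and Ca²⁺ and grows in magnitude with ionic charge, which is consistent with the increasingly approximate nature of the screening scheme along the series.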
3.2 Static dipole hyperpolarizabilities

We now turn our attention to the higher order polarizabilities ($\gamma$ etc.) in Eq. (9), which were named hyperpolarizabilities by Coulson et al.[61]. The development of the theory and computation of these properties came much later than that of the polarizabilities. The impetus for this is largely due to Buckingham and the review he wrote in 1967[14]. The amount of interest they are receiving grows every year and, though Pauling had no impact on this field, it is natural to consider them briefly in this chapter, since they are a logical extension of the polarizabilities which have just been discussed.

3.2.1 The H atom
For the hydrogen atom, using the method of Coulson[11], Bishop and Pipin[62] have calculated $\gamma$ and the next higher-order non-zero hyperpolarizability (X5) analytically. A number of mixed dipole-quadrupole hyperpolarizabilities were also determined in this paper. The first calculation of $\gamma$(H) was made by Sewell[63]. The hydrogen-like atoms have $\gamma$ values which scale exactly as $Z^{-10}$, where $Z$ is the nuclear charge. Consequently, the rest of the series is easily determined from $\gamma$(H) = 10665/8 au.

3.2.2 The He isoelectronic series
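For this series the most reliable values are found by the ECW-SOS method. For $\gamma$ the SOS formula is[37]

$$
\gamma = 24\left[\,{\sum_{m,n,p}}' \frac{\langle g|z|m\rangle\langle m|z|n\rangle\langle n|z|p\rangle\langle p|z|g\rangle}{(E_m-E_g)(E_n-E_g)(E_p-E_g)} \;-\; {\sum_{m,p}}' \frac{\langle g|z|m\rangle\langle m|z|g\rangle\langle g|z|p\rangle\langle p|z|g\rangle}{(E_m-E_g)(E_p-E_g)^2}\right] \tag{20}
$$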
where the same notation as in Eq. (11) has been used. With the pseudospectral wavefunctions and energies as described in section 3.1.1, values of $\gamma$ for H$^-$, He, Li$^+$, etc. have been found. They are given in Refs. [28], [33], and [37]. For H$^-$ the rate of convergence with the number of states included in Eq. (20) was quite slow and very large basis sets were required. The best value[28] found was $\gamma$(H$^-$) = 8.03×10$^7$ au. For He, the convergence was better and a value of $\gamma$(He) = 43.104 au was reported[33]; this agreed very well with an earlier variation-perturbation treatment by Buckingham and Hibbard[35]. For the rest of the series[37] the values published are 0.243 (Z=3), 0.008 (Z=4), 0.0007 au (Z=5), etc. If the formula
$$
\gamma = (10665/4)\,(Z-S)^{-10} \tag{21}
$$
is used, and this is the $\gamma$ counterpart to Pauling's approach to $\alpha$, then $S$ must equal 0.489 in order to reproduce the $\gamma$(He) = 43.104 au value. The rest of the series is then 0.268 (Z=3), 0.009 (Z=4) and 0.0008 au (Z=5). So the scaling procedure is quite satisfactory. It is, however, poor for H$^-$, giving a value of 2.2×10$^6$ au.

3.2.3 The Ne isoelectronic series
Since 1986 there have been quite a few electron-correlated calculations of $\gamma$(Ne), usually based on many-body perturbation theory. However, the results have varied considerably (see Table 2), owing to basis set size and the care with which the various orders of perturbation theory are implemented. In tandem with this situation the experimental value has been 'fluctuating'. Table 2 shows the course of events over the last nine years. It appears that there is now a degree of stability, with a value of 106 ± 5 au encompassing most of the theoretical and experimental results. Other investigations, not given in Table 2, are those of Bishop and Cybulski[57], Rice et al.[45], and Hettema et al.[73].

Table 2. Chronology of experimental and theoretical values for $\gamma$ for Ne, in atomic units.
Year    Experiment       Theory
1968    101 ± 8[64]
1986                     104.6[42]
1988    84 ± 9[65]
1989    116 ± 2[66]      86.5[67]; 113.9[43]; 119 ± 4[44a]
1990    119.2[68]        111[69]
1991                     99 ± 6[70]
1992    108 ± 2[71]      110 ± 3[46]; 111 ± 4[44b]
1993                     106 ± 5[72]
1995                     108[26]
The only other member of the series to have been investigated in any significant manner is the fluoride anion, F$^-$. Several values of $\gamma$(F$^-$) are given in Ref. [26]. There seems to be reasonably good agreement for this quantity at the HF level of approximation: 1.3• au[74] and 1.1• au[26]. But, at a correlated level, the result changes enormously with the level of the correlation approximation. This is not surprising for such a highly correlated ion. The configuration interaction (singles and doubles) result is 2.23• au[75] and from Ref. [26] the MP2 result is 28.4• But the MP4 (SDTQ) value is 5.1×10$^4$ au, which shows how difficult it is going to be to pin down this property precisely for this ion. There are only uncoupled HF results for the rest of the series[76], but this method is known to be unreliable.

3.2.4 The Ar isoelectronic series
At the SCF level there is quite good agreement between different groups on the value of $\gamma$(Ar), e.g., 958.9 au[77], 966 au[45], 974 au[25], and 947 au[56]. At the correlated level several values have been found: 1123 au[56], 1220 au[45], 1174 au[25], and 1329 au[55]. For Cl$^-$ there exists a single electron-correlated value of $\gamma$, that given by Kellö et al.[59] for an MP4 treatment: 1.285×10$^5$ au. Their SCF value was 0.577×10$^5$ au, which shows the importance of correlation for this ion. All other members of this isoelectronic series have only been investigated at the uncoupled HF level[76].
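As a numerical footnote to Section 3.2, the screened-charge scaling of Eq. (21) for the He isoelectronic series can be checked directly. The sketch below assumes only what the text states: $\gamma$(H) = 10665/8 au, the $Z^{-10}$ scaling, and the reference value $\gamma$(He) = 43.104 au[33]. The function and variable names are illustrative.

```python
GAMMA_H = 10665.0 / 8.0   # exact gamma for the hydrogen atom, atomic units
GAMMA_HE_REF = 43.104     # gamma(He) reported in Ref. [33], atomic units


def gamma_screened(Z, S):
    """Eq. (21): gamma = (10665/4) (Z - S)^(-10) for a two-electron ion."""
    return 2.0 * GAMMA_H * (Z - S) ** -10


# Invert Eq. (21) at Z = 2 to obtain the screening constant that reproduces gamma(He).
S = 2.0 - (2.0 * GAMMA_H / GAMMA_HE_REF) ** 0.1
print(f"S = {S:.3f}")

# Apply the same S to the other members of the series (Z = 1 is H-).
for Z in (1, 2, 3, 4, 5):
    print(f"Z = {Z}: gamma = {gamma_screened(Z, S):.4g} au")
```

With $S$ fixed in this way, the values for Z = 3 to 5 can be compared with the published ECW-SOS values quoted in Section 3.2.2, and the Z = 1 value with the much larger $\gamma$(H$^-$) of Ref. [28].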
3.3 Dynamic dipole polarizabilities and hyperpolarizabilities

When Pauling published his famous paper in 1927, it was several decades before the invention of the laser. This was an event which would transform scientific research in the second half of the century, not least the investigation of hyperpolarizabilities. If, in Eq. (10), electric fields associated with light are introduced (i.e. dynamic fields), then the $F^3$ factor becomes $F_{\omega_1}F_{\omega_2}F_{\omega_3}$, where the $\omega_i$ are the frequencies of the fields, some of which may be zero (static). The polarization will become frequency dependent, with a frequency $\omega_\sigma = \omega_1 + \omega_2 + \omega_3$, and $\gamma$ will be written as $\gamma(-\omega_\sigma; \omega_1, \omega_2, \omega_3)$. The different nonlinear optical (NLO) processes are identified by the values of $\omega_1$, $\omega_2$ and $\omega_3$: for example, for the Kerr effect $\omega_1 = \omega$, $\omega_2 = \omega_3 = 0$; for electric-field second-harmonic generation (ESHG) $\omega_1 = \omega_2 = \omega$ and $\omega_3 = 0$; for third harmonic generation (THG) $\omega_1 = \omega_2 = \omega_3 = \omega$; and for degenerate four wave mixing (DFWM) $\omega_1 = \omega_2 = \omega$, $\omega_3 = -\omega$. The Kerr effect, which governs birefringence (change of refractive index) in a static electric field, was discovered in the last century, but the other phenomena were only detected when very intense light sources (lasers) became available. The frequency dependence of the polarizability governs the change of refractive index with the frequency of the light source. In Eq. (11), $F$ becomes $F_\omega$ and we can write $\alpha$ either as $\alpha(\omega)$ or as $\alpha(-\omega_\sigma; \omega_1)$ with $\omega_\sigma = \omega$ and $\omega_1 = \omega$. The theory for $\alpha(\omega)$ was developed early on and there have been many calculations; this is the linear effect. Computation of $\gamma(\omega)$, the nonlinear property, is a fairly recent departure, spurred on by the potential that nonlinear processes have for commercial exploitation.
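Whereas for atoms the static hyperpolarizability tensor $\gamma(0; 0, 0, 0)$ has only one independent component, for $\gamma(-\omega_\sigma; \omega_1, \omega_2, \omega_3)$ there are a maximum of three (how many depends on the values of the $\omega_i$), and they are identified by Cartesian axis subscripts on $\gamma$, in general $\gamma_{\alpha\beta\gamma\delta}$, where $\alpha, \beta, \gamma, \delta = x, y, z$. With these generalizations, the SOS formulas for $\alpha$ and $\gamma$, Eqs. (11) and (20), become

$$
\alpha_{\alpha\beta}(-\omega_\sigma;\omega_1) = \hbar^{-1}\sum_P {\sum_m}' \frac{\langle g|\hat\mu_\alpha|m\rangle\langle m|\hat\mu_\beta|g\rangle}{\omega_m-\omega_\sigma} \tag{22}
$$

and

$$
\gamma_{\alpha\beta\gamma\delta}(-\omega_\sigma;\omega_1,\omega_2,\omega_3) = \hbar^{-3}\sum_P\left[\,{\sum_{m,n,k}}' \frac{\langle g|\hat\mu_\alpha|m\rangle\langle m|\hat\mu_\delta|n\rangle\langle n|\hat\mu_\gamma|k\rangle\langle k|\hat\mu_\beta|g\rangle}{(\omega_m-\omega_\sigma)(\omega_n-\omega_1-\omega_2)(\omega_k-\omega_1)} \;-\; {\sum_{m,n}}' \frac{\langle g|\hat\mu_\alpha|m\rangle\langle m|\hat\mu_\delta|g\rangle\langle g|\hat\mu_\gamma|n\rangle\langle n|\hat\mu_\beta|g\rangle}{(\omega_m-\omega_\sigma)(\omega_n-\omega_1)(\omega_n+\omega_2)}\right] \tag{23}
$$

In these equations $\sum_P$ denotes a sum over the terms obtained by permuting the pairs $(-\omega_\sigma, \hat\mu_\alpha)$, $(\omega_1, \hat\mu_\beta)$, etc., $\hat\mu_\eta$ is the dipole moment operator along the $\eta$ axis, and $\hbar\omega_m = E_m - E_g$. A full discussion of the origin of these terms has been given by Bishop[78].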
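To make the permutation sum in Eq. (22) concrete, the following minimal sketch evaluates $\alpha_{zz}(-\omega;\omega)$ for a model system in atomic units ($\hbar = 1$). The excitation energies and transition dipoles are hypothetical numbers chosen purely for illustration; they are not taken from any calculation cited in this chapter. For the linear response there are only two permutations of the pairs $(-\omega_\sigma, \hat\mu_\alpha)$ and $(\omega_1, \hat\mu_\beta)$, which give the two denominators $\omega_m - \omega$ and $\omega_m + \omega$.

```python
import numpy as np

# Hypothetical model data in atomic units (hbar = 1); not from the chapter.
omega_m = np.array([0.40, 0.55, 0.80])   # excitation energies E_m - E_g
mu_gm = np.array([0.90, 0.50, 0.30])     # transition dipoles <g|mu_z|m>


def alpha_zz(omega):
    """Eq. (22) for alpha_zz(-omega; omega) with omega_sigma = omega_1 = omega."""
    first = mu_gm**2 / (omega_m - omega)    # pairs in the order written in Eq. (22)
    second = mu_gm**2 / (omega_m + omega)   # pairs interchanged by the permutation
    return float(np.sum(first + second))


alpha_static = alpha_zz(0.0)   # static limit, 2 * sum(mu^2 / omega_m)

for w in (0.0, 0.05, 0.10):
    print(f"omega = {w:4.2f} au   alpha_zz = {alpha_zz(w):.5f} au")
```

Below the first resonance the result is an even function of $\omega$ and increases with frequency, which is the behaviour usually summarized by a Cauchy series in even powers of $\omega$.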
The frequency dependence of $\alpha(\omega)$ is well understood and it is usually expressed by a Cauchy series. For $\gamma(\omega)$ it is only recently that a thorough description has become available[79]-[83]. In general, however, $\gamma(\omega)$ is calculated for specific frequencies ($\omega$) and individual NLO processes. As well as their importance in NLO processes, $\alpha(\omega)$ and $\gamma(\omega)$ with imaginary frequencies, i.e., $\alpha(-i\omega; i\omega)$ and $\gamma(-i\omega; i\omega, 0, 0)$, play an important role in the calculation of van der Waals dispersion coefficients. We will not discuss this aspect and the reader is referred to, for example, Refs. [84] and [85]. We will now give a very brief survey of this area in as far as it pertains to the atoms and ions previously discussed.

3.3.1 The H atom
There have been several papers published on $\alpha(\omega)$ and $\gamma(\omega)$ for the hydrogen atom[85]-[90]. Shelton[89] used an expansion in Sturmian functions to obtain $\gamma$ values for Kerr, ESHG, THG and DFWM at a number of frequencies. A more straightforward and simpler method is to use the SOS approach and a pseudospectral series based on the wavefunctions formed by the linear combinations:
~'~t:Z2k~
(24)
where l = 0,1,2 and 3 for the S, P, D, and F states, respectively. The linear coefficients are found by diagonalization of the Hamiltonian matrix and the nonlinear parameters are chosen in a pseudorandom fashion from
(25)

where ((x)) denotes the fractional part of x. This has been done for $\alpha(\omega)$ and $\gamma(\omega)$[85], though the results for $\gamma(\omega)$ were not reported since they were identical to those of Shelton[89].
3.3.2 The He isoelectronic series
The dynamic polarizability, $\alpha(\omega)$, has been accurately calculated for He with the ECW-SOS method described in 3.1.1, modified to incorporate the optical frequencies. The results are given in Ref. [85] and fall within the bounds determined by Glover and Weinhold[31]. Other recent calculations involving $\alpha(\omega)$ for He are Refs. [90]-[92]. Values of $\gamma(\omega)$ for several NLO processes and frequencies are given for He in Ref. [33]. These numbers have a particular significance since they are used to calibrate ESHG measurements[21]. They were found by using ECW wavefunctions in conjunction with Eq. (23). To all intents and purposes they have replaced, in this domain, the 1968 numbers of Sitz and Yaris[93]. For H$^-$ the same approach, ECW-SOS, has been used and the results for $\alpha(\omega)$ and $\gamma(\omega)$ are reported in Ref. [28]. Other calculations for these properties are to be found in Refs. [31], [90] and [94]-[96].

For Li$^+$, only frequency-dependent polarizabilities have been evaluated[31],[94],[97],[98].

3.3.3 The Ne and Ar isoelectronic series
The frequency dependence of $\gamma(\omega)$ for neon was something of a cause célèbre a few years ago, with papers appearing with titles like "Anomalous Hyperpolarizability Dispersion Measured for Neon" and "The Hyperpolarizability Dispersion of Neon is not Anomalous". The furor started when Shelton[66] reported ESHG measurements showing an unconventional, initial decrease in $\gamma$ with frequency. This prompted a comment by Bishop[99] that, on the basis of known sum rules and the SOS formulas, this was theoretically most unlikely. This position was confirmed by Jensen et al.[70] on the grounds of specific calculations for $\gamma(\omega)$. The problem was resolved when Shelton and Donley[71] reported that there had been an artefact in the original measurements. Other evaluations of $\gamma(\omega)$ for Ne have been made by Jaszunski and Yeager[67] and by Hettema et al.[73]. For $\alpha(\omega)$ the most recent investigations are those of Hättig and Hess[92] and, using time-dependent density functional theory, of van Gisbergen et al.[100]. For the fluoride ion only $\alpha(\omega)$ has been investigated[96]. For argon, the most recent and most thorough calculations are those of Jaszunski et al.[56], using MCSCF calculations. Both $\alpha(\omega)$ and $\gamma(\omega)$ values were reported. For the chloride ion there exists only a limited amount of data, and that only for $\alpha(\omega)$; see Refs. [60] and [96].

4. Conclusions and Other Aspects
Not included in the previous survey are the efforts that have been made to relate polarizabilities to other properties. This would have been close to Pauling's heart since he was always, as a chemist, after trends and relationships. To make amends, the reader is referred to recent publications on connections to atomic ionization potentials[101], atomic softness[102], bond dissociation energies[103], and electronegativities[104].

To conclude, the first thing that should be said is that Pauling's Royal Society paper was ahead of its time. Even today, some of the ions he considered have yet to undergo a full ab initio treatment. Second, the paper shows Pauling's ability, seen in other areas as well, to take the mathematics of a simple system and extend and simplify it so that it could be used for a large range of species. Third, this paper shows that Pauling was not afraid to use experimental data if it served his purpose. Finally, he wasn't content with the properties of a single species: he wanted the whole Periodic Table!
References

1. I have used the following sources for this section: "Linus Pauling: Scientist and Advocate" by David E. Newton (Facts on File Inc., N.Y., 1994); "Force of Nature: the Life of Linus Pauling" by Thomas Hager (Simon and Schuster, N.Y., 1995); "Linus Pauling: A Life in Science and Politics" by Ted Goertzel and Ben Goertzel (HarperCollins Publ. Inc., N.Y., 1995).
2. L. Pauling, Z. für Physik, 49, 334 (1926).
3. L. Pauling, Proc. Roy. Soc. (London), A114, 181 (1927).
4. P.S. Epstein, Ann. Physik, 50, 489 (1916).
5. A. Sommerfeld, Ann. Physik, 65, 36 (1921).
6. T. Takamine, N. Kokubu, Memoirs of the College of Science (Kyoto Imperial University), 3, 271 (1919).
7. E. Schrödinger, Ann. Physik, 79, 361 (1926).
8. G. Wentzel, Z. für Physik, 38, 518 (1926).
9. I. Waller, Z. für Physik, 38, 635 (1926).
10. P.S. Epstein, Nature, 118, 444 (1926); Phys. Rev., 28, 695 (1926).
11. C.A. Coulson, Proc. Roy. Soc. Edin., A61, 20 (1941).
12. L. Pauling, E. Bright Wilson, Jr., Introduction to Quantum Mechanics (with applications to chemistry) (McGraw Hill, N.Y., 1935).
13. A. Dalgarno, Adv. Phys., 11, 281 (1962).
14. A.D. Buckingham, Adv. Chem. Phys., 12, 107 (1967).
15. M.P. Bogaard, B.J. Orr, International Review of Science, Physical Chemistry, Molecular Structure, and Properties, edited by A.D. Buckingham (Butterworths, London, 1975), Ser. 2, Vol. 2, p. 149.
16. C.E. Dykstra, S.-Y. Liu, D.J. Malik, Adv. Chem. Phys., 75, 37 (1989).
17. A.A. Hasanein, Adv. Chem. Phys., 85, 415 (1993).
18. D.M. Bishop, Adv. Quant. Chem., 25, 1 (1994).
19. Y. Luo, H. Ågren, P. Jørgensen, K.V. Mikkelsen, Adv. Quant. Chem., 26, 165 (1995).
20. K.D. Bonin, M.A. Kadar-Kallen, Int. J. Mod. Phys. B, 8, 3313 (1994).
21. D.P. Shelton, J.E. Rice, Chem. Revs., 94, 3 (1994).
22. J. Stiehler, J. Hinze, J. Phys. B, 28, 4055 (1995).
23. H.D. Cohen, C.C.J. Roothaan, J. Chem. Phys., 43, S34 (1965).
24. T. Voegel, J. Hinze, F. Tobin, J. Chem. Phys., 70, 1107 (1979).
25. D.E. Woon, T.H. Dunning, Jr., J. Chem. Phys., 100, 2975 (1994).
26. M.G. Papadopoulos, J. Waite, A.D. Buckingham, J. Chem. Phys., 102, 371 (1995).
27. P.W. Fowler, P. Jørgensen, J. Olsen, J. Chem. Phys., 93, 7256 (1990).
28. J. Pipin, D.M. Bishop, J. Phys. B, 25, 17 (1992).
29. A.J. Thakkar, V.H. Smith, Jr., Phys. Rev. A, 15, 1 (1977).
30. A.J. Thakkar, J. Chem. Phys., 75, 4496 (1981).
31. R.M. Glover, F. Weinhold, J. Chem. Phys., 65, 4913 (1976).
32. A.K. Bhatia, R.J. Drachman, J. Phys. B, 27, 1299 (1994).
33. D.M. Bishop, J. Pipin, J. Chem. Phys., 91, 3549 (1989).
34. D. Gugan, G.W. Michel, Mol. Phys., 39, 783 (1980).
35. A.D. Buckingham, P.G. Hibbard, Symp. Farad. Soc., 2, 41 (1968).
36. C. Pouchan, D.M. Bishop, Phys. Rev. A, 29, 1 (1984).
37. D.M. Bishop, M. Rérat, J. Chem. Phys., 91, 5489 (1989).
38. G. Maroulis, D.M. Bishop, J. Phys. B, 19, 369 (1986).
39. M.K. Chen, J. Phys. B, 28, 1349 (1995).
40. M.K. Harbola, Chem. Phys. Lett., 217, 461 (1994).
41. R.P. McEachran, A.D. Stauffer, S. Greita, J. Phys. B, 12, 3119 (1979).
42. I. Cernusak, G.H.F. Diercksen, A.J. Sadlej, Phys. Rev. A, 33, 814 (1986).
43. G. Maroulis, A.J. Thakkar, Chem. Phys. Lett., 156, 87 (1989).
44. (a) P.R. Taylor, T.J. Lee, J.E. Rice, J. Almlöf, Chem. Phys. Lett., 163, 359 (1989); (b) P.R. Taylor, T.J. Lee, J.E. Rice, J. Almlöf, Chem. Phys. Lett., 189, 197 (1992).
45. J.E. Rice, P.R. Taylor, T.J. Lee, J. Almlöf, J. Chem. Phys., 94, 4972 (1991).
46. J.E. Rice, G.E. Scuseria, T.J. Lee, P.R. Taylor, J. Almlöf, Chem. Phys. Lett., 191, 23 (1992).
47. A. Nicklass, M. Dolg, H. Stoll, H. Preuss, J. Chem. Phys., 102, 8942 (1995).
48. S. Wilson, A.J. Sadlej, Theoret. Chim. Acta, 60, 19 (1981).
49. G.H.F. Diercksen, A.J. Sadlej, Mol. Phys., 47, 33 (1982).
50. C. Nelin, B.O. Roos, A.J. Sadlej, P.E.M. Siegbahn, J. Chem. Phys., 77, 3607 (1982).
51. S.A. Kucharski, Y.S. Lee, G.D. Purvis III, R.J. Bartlett, Phys. Rev. A, 29, 1619 (1984).
52. A.K. Das, D. Ray, P.K. Mukherjee, Theor. Chim. Acta, 82, 223 (1992).
53. M. Yoshimine, R.P. Hurst, Phys. Rev., 135, A612 (1964).
54. M.K. Harbola, Phys. Rev. A, 48, 2696 (1993).
55. I. Cernusak, G.H.F. Diercksen, A.J. Sadlej, Chem. Phys. Lett., 128, 18 (1986).
56. M. Jaszunski, P. Jørgensen, A. Rizzo, Theor. Chim. Acta, 90, 291 (1995).
57. D.M. Bishop, S.M. Cybulski, Chem. Phys. Lett., 200, 153 (1992).
58. G.H.F. Diercksen, A.J. Sadlej, Chem. Phys. Lett., 84, 390 (1981).
59. V. Kellö, B.O. Roos, A.J. Sadlej, Theor. Chim. Acta, 74, 185 (1988).
60. M. Kutzner, H.P. Kelly, D.J. Larson, Z. Altun, Phys. Rev. A, 38, 5107 (1988).
61. C.A. Coulson, A. Maccoll, L.E. Sutton, Trans. Farad. Soc., 48, 106 (1952).
62. D.M. Bishop, J. Pipin, Chem. Phys. Lett., 236, 15 (1995).
63. G.L. Sewell, Proc. Camb. Phil. Soc., 45, 678 (1949).
64. A.D. Buckingham, D.A. Dunmur, Trans. Farad. Soc., 64, 1776 (1968).
65. D.P. Shelton, Z. Lu, Phys. Rev. A, 37, 3813 (1988).
66. D.P. Shelton, Phys. Rev. Lett., 62, 2660 (1989).
67. M. Jaszunski, D.L. Yeager, Phys. Rev. A, 40, 1651 (1989).
68. D.P. Shelton, Phys. Rev. A, 42, 2578 (1990).
69. D.P. Chong, S.R. Langhoff, J. Chem. Phys., 93, 570 (1990).
70. H.J.Aa. Jensen, P. Jørgensen, H. Hettema, J. Olsen, Chem. Phys. Lett., 187, 387 (1991).
71. D.P. Shelton, E.A. Donley, Chem. Phys. Lett., 195, 591 (1992).
72. O. Christiansen, P. Jørgensen, Chem. Phys. Lett., 207, 367 (1993).
73. H. Hettema, H.J.Aa. Jensen, P. Jørgensen, J. Olsen, J. Chem. Phys., 97, 1174 (1992).
74. G. Maroulis, D.M. Bishop, Mol. Phys., 57, 359 (1986).
75. G.H.F. Diercksen, A.J. Sadlej, J. Chem. Phys., 76, 4293 (1982).
76. P.W. Langhoff, J.D. Lyons, R.P. Hurst, Phys. Rev., 148, 18 (1966).
77. D.M. Bishop, S.M. Cybulski, Chem. Phys. Lett., 211, 255 (1993).
78. D.M. Bishop, J. Chem. Phys., 100, 6535 (1994).
79. D.P. Shelton, J. Chem. Phys., 84, 404 (1985).
80. D.M. Bishop, Phys. Rev. Lett., 61, 322 (1988).
81. D.M. Bishop, Chem. Phys. Lett., 153, 441 (1988).
82. D.M. Bishop, J. Chem. Phys., 90, 3192 (1989).
83. D.M. Bishop, D.W. De Kee, submitted.
84. D.M. Bishop, J. Pipin, J. Chem. Phys., 97, 3375 (1992).
85. D.M. Bishop, J. Pipin, Int. J. Quant. Chem., 45, 349 (1993).
86. T.Y. Chang, J. Chem. Phys., 56, 1745 (1972).
87. I. Shimamura, J. Phys. Soc. Jap., 40, 239 (1976).
88. I.V. Bondarev, S.A. Kuten, I.E. Lantsov, J. Phys. B, 25, 4981 (1992).
89. D.P. Shelton, Phys. Rev. A, 36, 3032 (1987).
90. D. Spelsberg, T. Lorenz, W. Meyer, J. Chem. Phys., 99, 7845 (1993).
91. Z.W. Liu, H.P. Kelly, Theor. Chim. Acta, 80, 307 (1991).
92. C. Hättig, B.A. Hess, Chem. Phys. Lett., 233, 359 (1995).
93. P. Sitz, R. Yaris, J. Chem. Phys., 49, 3546 (1968).
94. K.T. Chung, Phys. Rev. A, 4, 7 (1971).
95. C.A. Nicolaides, T. Mercouris, N.A. Piangos, J. Phys. B, 23, L669 (1990).
96. M. Kutzner, M. Felton, D. Winn, Phys. Rev. A, 45, 7761 (1992).
97. T.J. Venanzi, B. Kirtman, J. Chem. Phys., 59, 523 (1973).
98. Y.Y. Dmitriev, Y.B. Malykhanov, B. Rus, Opt. Spectrosc. (USSR), 57, 334 (1984).
99. D.M. Bishop, Phys. Rev. Lett., 65, 1688 (1990).
100. S.J.A. van Gisbergen, J.G. Snijders, E.J. Baerends, J. Chem. Phys., 103, 9347 (1995); and references therein.
101. B. Fricke, J. Chem. Phys., 84, 862 (1986).
102. P. Fuentealba, O. Reyes, J. Molec. Struct. (Theochem.), 282, 65 (1993).
103. U. Hohm, J. Chem. Phys., 101, 6362 (1994).
104. S. Hati, D. Datta, J. Phys. Chem., 99, 10742 (1995).
Z.B. Maksić and W.J. Orville-Thomas (Editors)
Pauling's Legacy: Modern Modelling of the Chemical Bond
Theoretical and Computational Chemistry, Vol. 6
© 1999 Elsevier Science B.V. All rights reserved.
Molecular polarizabilities and magnetizabilities

Pål Dahle, Kenneth Ruud, and Trygve Helgaker
Department of Chemistry, University of Oslo, P.O. Box 1033 Blindern, N-0315 Oslo, Norway

Peter R. Taylor
Department of Chemistry and Biochemistry, University of California, San Diego, and San Diego Supercomputer Center, P.O. Box 85608, San Diego, CA 92186-9784, USA
1 Introduction
The ab-initio calculation of molecular properties is a useful tool in chemical research, bridging the gap between observation and interpretation. The agreement between calculation and experiment may lend support to the theoretical interpretation of a given molecular property or help unravel the mechanistic reasons for a given molecular behavior, thereby assisting in the understanding, prediction, and design of molecules with specific properties.

In chemistry, a vast range of experimental observations has been rationalized into simple yet powerful rules for understanding molecular properties and reactivity. Linus Pauling, in particular, emphasized the importance and usefulness of such rules, based on simple physical pictures of the molecules and their electronic structure. Modern ab-initio theory complements the use of such rules, providing accurate numbers which may themselves be interpreted in terms of these rules or which in other cases may be used to elucidate the rules, exploring their range of applicability and providing a better understanding of the cases where the simple rules of thumb fail. In the present paper, we discuss, from the perspective of ab-initio calculations and empirical rules, two important classes of properties: the molecular polarizability and the molecular magnetizability.

In modern ab-initio theory, once the choice of the approximate wave function has been made, the evaluation of a given molecular property may be carried out in a black-box manner, using any of the standard methods for extracting molecular properties from the wave function. The quality of such calculations depends critically on our choice of the approximate wave function, that is, on the choice of one-electron basis (basis set) as well as on the choice of the N-electron model. This is particularly true for many molecular properties, which place very strict demands on the flexibility of the one-electron basis set in which the N-electron wave function is expanded. Basis sets developed for the accurate calculation of ground-state energies may often give poor results for properties whose associated perturbations probe regions of the electronic structure that make little or no contribution to the energy of the unperturbed system. Two important examples are the polarizability and magnetizability of a molecule, which describe the second-order responses of the system to external electric and magnetic fields, respectively. If these properties are calculated using standard basis sets developed for accurate calculations of ground-state energies, the resulting numbers will be in poor agreement with experiment. Thus, care must be exercised in selecting a basis that has the flexibility required to represent accurately the perturbations associated with a given molecular property.

Until recently, the calculation of molecular magnetizabilities was hampered by the fact that no practical solution had been found or implemented to overcome the basis-set problem in calculations involving an external magnetic field. As a result, the calculation of nuclear shieldings and magnetizabilities was a highly unreliable business, fraught with convergence problems and issues arising from the dependence of calculated properties on the choice of gauge origin, and the calculated values for molecular magnetizabilities were often in poor agreement with experiment even when rather large basis sets were used. In recent years, this situation has been reversed and the calculation of molecular magnetizabilities may now be considered a straightforward task.

We do not here attempt to review the developments of methods for calculating molecular polarizabilities and magnetizabilities, nor will we attempt any review of theoretical results obtained for these properties. We will instead outline the approach for calculating molecular properties in general, and the polarizability and the magnetizability of closed-shell systems in particular, using ab-initio methods. We will also devote some time to the problem of gauge-origin dependence in the calculation of magnetic properties and discuss the approach we have taken to overcome this obstacle. Finally, we will discuss some of our recent calculations of the magnetizability and the static polarizability of larger molecules, some of which touch upon early work by Linus Pauling.
2 Molecular properties as energy derivatives
Let us consider the electronic energy $E(x)$ as a function of some external parameter $x$. When a molecular electronic system is perturbed in some manner, its total electronic energy changes and may be expressed in terms of a Taylor expansion as

$$
E(x) = E^{(0)} + x^{T}E^{(1)} + \tfrac{1}{2}\,x^{T}E^{(2)}x + O(x^{3}) \tag{1}
$$

The coefficients $E^{(n)}$ that appear in this expansion describe the response of the molecular system to the external perturbation $x$ and are known as molecular properties. The molecular properties are characteristic of the molecular system and its quantum state. When the perturbation is static, these properties may be calculated by differentiation at $x = 0$

$$
E^{(1)} = \left.\frac{dE}{dx}\right|_{0} \tag{2}
$$

$$
E^{(2)} = \left.\frac{d^{2}E}{dx^{2}}\right|_{0} \tag{3}
$$
and are then referred to as time independent or static. Examples of molecular properties are the molecular forces (the first derivatives with respect to nuclear displacements), the molecular force constants (the second and higher derivatives with respect to nuclear displacements), the magnetic nuclear shielding constants (the second derivatives with respect to nuclear magnetic moments and the external magnetic field) and the nuclear spin-spin coupling constants (the second derivatives with respect to the nuclear magnetic moments of the coupled paramagnetic nuclei). Of special interest to us are the properties related to the external application of uniform static electric and magnetic fields, which we will denote by $F$ and $B$, respectively. To second order, the energy of a closed-shell molecular system may be written as

$$
E(F, B) = E_{0} - \mu^{T}F - \tfrac{1}{2}\,F^{T}\alpha F - \tfrac{1}{2}\,B^{T}\xi B + \cdots \tag{4}
$$

where $\mu$ is the permanent molecular electric dipole moment

$$
\mu = -\left.\frac{dE}{dF}\right|_{0} \tag{5}
$$

$\alpha$ the dipole polarizability tensor

$$
\alpha = -\left.\frac{d^{2}E}{dF^{2}}\right|_{0} \tag{6}
$$

and $\xi$ the molecular magnetizability tensor

$$
\xi = -\left.\frac{d^{2}E}{dB^{2}}\right|_{0} \tag{7}
$$
which are all evaluated at zero field. In this expansion, we have taken into account the fact that all static odd-order magnetic perturbations vanish for closed-shell molecules, as we will later show for the first-order magnetic interaction, the magnetic dipole moment.

There are two different approaches that may be taken to the calculation of static molecular properties: the associated energy derivatives may be calculated numerically or analytically. The numerical procedure involves the evaluation of derivatives by finite differences or polynomial fitting; the analytical procedure involves the calculation of derivatives directly from analytical expressions. The analytical approach requires considerable programming effort but offers greater speed, precision, and convenience than does the numerical approach, which may experience difficulties related to numerical stability and computational efficiency. The numerical approach is simple in the sense that, at any level of electronic structure theory, it does not require special programming provided the perturbation is real: we may simply repeat the calculation of the energy for different values of the perturbational parameter. However, if the perturbation is imaginary or complex, great complications ensue because virtually all quantum-chemistry codes assume the Hamiltonian is real. In fact, for most properties of general interest and importance in quantum chemistry, the analytical approach is the preferred one; this is especially true for the molecular gradient, whose analytical evaluation is vastly superior to the numerical approach for all but the smallest systems.

Let us consider the analytical evaluation of molecular properties. We shall here write the electronic energy function in the form $\mathcal{E}(x; \lambda)$, where $x$ is a set of external parameters that characterize the physical system, and $\lambda$ a set of wave-function or electronic parameters that determine the electronic state. We shall think of the external parameters $x$ as representing the electric field $F$ or the magnetic induction $B$ but note that the results obtained here hold for other perturbations as well. The electronic parameters $\lambda$ represent (directly or indirectly) any set of parameters in terms of which the electronic wave function is expressed. To keep things simple, we shall first assume that the electronic energy is fully variational with respect to the electronic parameters $\lambda$. Thus, we shall assume that the electronic energy may be calculated from the expression

$$
E(x) = \mathcal{E}(x; \lambda^{*}) \tag{8}
$$

where the parameters $\lambda^{*}$ represent the optimal value of $\lambda$ and where, for all $x$, the optimized energy function $\mathcal{E}(x; \lambda^{*})$ satisfies the variational conditions

$$
\left.\frac{\partial \mathcal{E}(x; \lambda)}{\partial \lambda}\right|_{*} = 0 \tag{9}
$$

with the partial derivatives calculated at $\lambda = \lambda^{*}$. To ensure that the variational conditions Eq. 9 are always fulfilled, the electronic parameters $\lambda$ must change in a particular manner as the perturbation $x$ is turned on. The variational conditions therefore implicitly determine the dependence of the electronic parameters $\lambda(x)$ on $x$. Let us consider the first-order properties for the optimized variational energy $E(x)$ in Eq. 8. Using the chain rule, we obtain
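$$
\frac{dE(x)}{dx} = \left.\frac{\partial \mathcal{E}(x; \lambda)}{\partial x}\right|_{*} + \left.\frac{\partial \mathcal{E}(x; \lambda)}{\partial \lambda}\right|_{*}\frac{\partial \lambda}{\partial x} \tag{10}
$$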
where the differentiation is carried out at $\lambda = \lambda^{*}$. The first term on the right-hand side of this equation represents the explicit dependence of the electronic-energy function on $x$ and arises, for example, from the dependence of the Hamiltonian on the external field; the second term represents the implicit dependence of the energy function on $x$ and arises since the wave function changes as the field is turned on. The derivatives of the electronic parameters with respect to the external parameters, $\partial\lambda/\partial x$, tell us how, to first order, the wave function changes when the perturbation is applied. Combining Eq. 9 and Eq. 10, we obtain the following simple expression for first-order properties (e.g., for the permanent electric and magnetic dipole moments) for a fully variational wave function:
$$
\frac{dE(x)}{dx} = \left.\frac{\partial \mathcal{E}(x; \lambda)}{\partial x}\right|_{*} \tag{11}
$$
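In short, to calculate first-order properties for a fully variational wave function, we need not evaluate the response of the wave function $\partial\lambda/\partial x$. This is an extremely important result, which forms the basis for all computational techniques developed for the evaluation of molecular gradients (as well as for all higher-order properties). We now proceed to consider second-order properties such as the polarizability and magnetizability tensors. Differentiating the first-order property Eq. 11, we obtain from the chain rule

$$
\frac{d^{2}E(x)}{dx^{2}} = \left[\left(\frac{\partial}{\partial x} + \frac{\partial \lambda}{\partial x}\frac{\partial}{\partial \lambda}\right)\frac{\partial \mathcal{E}(x; \lambda)}{\partial x}\right]_{*} \tag{12}
$$

$$
\phantom{\frac{d^{2}E(x)}{dx^{2}}} = \left[\frac{\partial^{2}\mathcal{E}(x; \lambda)}{\partial x^{2}} + \frac{\partial^{2}\mathcal{E}(x; \lambda)}{\partial x\,\partial \lambda}\frac{\partial \lambda}{\partial x}\right]_{*} \tag{13}
$$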
We conclude that, for a fully variational wave function, only the first-order response of the wave function $\partial\lambda/\partial x$ is required to calculate the energy to second order. In particular, for the evaluation of polarizabilities and magnetizabilities, the second-order response of the wave function $\partial^2\lambda/\partial x^2$ is not needed.
Since we can no longer manage without the first-order response, let us consider its evaluation. We have already noted that the variational conditions Eq. 9 determine the dependence of the wave function on $x$. Differentiating these conditions with respect to $x$ and applying the chain rule, we obtain
$$\frac{d}{dx}\left[\frac{\partial\mathcal{E}(x;\lambda)}{\partial\lambda}\right]_{\lambda=\lambda^*} = \left[\frac{\partial^2\mathcal{E}(x;\lambda)}{\partial x\,\partial\lambda} + \frac{\partial^2\mathcal{E}(x;\lambda)}{\partial\lambda^2}\frac{\partial\lambda}{\partial x}\right]_{\lambda=\lambda^*} = 0 \tag{14}$$
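Introducing the following notation for the electronic gradient and the electronic Hessian of the optimized wave function
$$\mathcal{F}(x) = \left.\frac{\partial\mathcal{E}(x;\lambda)}{\partial\lambda}\right|_{\lambda=\lambda^*} \tag{15}$$
$$\mathcal{G}(x) = \left.\frac{\partial^2\mathcal{E}(x;\lambda)}{\partial\lambda^2}\right|_{\lambda=\lambda^*} \tag{16}$$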
we find that Eq. 14 may be written in the form of a set of linear symmetric equations
$$\mathcal{G}(x)\,\frac{\partial\lambda}{\partial x} = -\frac{\partial\mathcal{F}(x)}{\partial x} \tag{17}$$
These equations are known as the response equations since they determine the first derivatives (i.e., the first-order response) of the wave function to the perturbation. In the response equations Eq. 17, only the differentiated electronic gradient $\partial\mathcal{F}(x)/\partial x$ on the right-hand side depends on the nature of the perturbation. An analogy with Hooke's law is here helpful: The electronic Hessian $\mathcal{G}(x)$ plays the role of the force constant and the perturbed gradient $-\partial\mathcal{F}(x)/\partial x$ represents the force. For the unperturbed system, the electronic gradient $\mathcal{F}(x)$ is zero and the wave function is optimal or stable. When the perturbation is turned on, the wave function of the original unperturbed system is no longer stable: the perturbation introduces a "force" $-\partial\mathcal{F}(x)/\partial x$ to which the wave function responds by stabilizing itself or "relaxing" by the amount $\partial\lambda/\partial x$. The "relaxation" $\partial\lambda/\partial x$ is proportional to the "force" $-\partial\mathcal{F}(x)/\partial x$ and inversely proportional to the "force constant" $\mathcal{G}(x)$.
Let us summarize our results so far. We have found that the first-order properties may be calculated according to the expression
$$\frac{dE}{dx} = \frac{\partial\mathcal{E}}{\partial x} \tag{18}$$
and the second-order properties from the expressions
$$\frac{d^2E}{dx^2} = \frac{\partial^2\mathcal{E}}{\partial x^2} + \frac{\partial\mathcal{F}^T}{\partial x}\frac{\partial\lambda}{\partial x} \tag{19}$$
$$\mathcal{G}\,\frac{\partial\lambda}{\partial x} = -\frac{\partial\mathcal{F}}{\partial x} \tag{20}$$
where all derivatives have been taken for the optimized wave function at zero field and the arguments have been omitted for clarity. Combining Eq. 19 and Eq. 20, we may write the second-order properties in the more compact form
$$\frac{d^2E}{dx^2} = \frac{\partial^2\mathcal{E}}{\partial x^2} - \frac{\partial\mathcal{F}^T}{\partial x}\,\mathcal{G}^{-1}\,\frac{\partial\mathcal{F}}{\partial x} \tag{21}$$
which involves the inverse $\mathcal{G}^{-1}$ of the electronic Hessian $\mathcal{G}$. In general, however, the dimension of the electronic Hessian is so large (because of the large number of electronic parameters in $\lambda$) that the inversion of the Hessian becomes prohibitive. The evaluation process is therefore more faithfully represented by the expressions Eq. 19 and Eq. 20: We first generate the wave-function responses by solving the linear equations Eq. 20 for the perturbations of interest; next, we calculate the second-order properties from the expression Eq. 19.
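The two-step procedure of Eqs. 18 to 20 can be illustrated with a small numerical sketch. The one-parameter energy function below is entirely hypothetical (it is not taken from any wave-function model) and is chosen only so that the variational condition can be solved by a few Newton steps; the function names are likewise illustrative. The analytic properties are compared with finite differences of the optimized energy.

```python
# Toy one-parameter model; the energy function below is hypothetical.
def energy_function(x, lam):
    """E(x; lam): explicit (x) and implicit (lam) dependence."""
    return 0.5 * lam**2 + 0.25 * lam**4 - x * (lam + 0.3) + 0.1 * x**2

def gradient(x, lam):
    """Electronic gradient F = dE/dlam."""
    return lam + lam**3 - x

def hessian(x, lam):
    """Electronic Hessian G = d2E/dlam2."""
    return 1.0 + 3.0 * lam**2

def optimize(x, lam=0.0):
    """Newton iterations for the variational condition F(x; lam) = 0."""
    for _ in range(100):
        step = gradient(x, lam) / hessian(x, lam)
        lam -= step
        if abs(step) < 1e-14:
            break
    return lam

def optimized_energy(x):
    return energy_function(x, optimize(x))

def first_order_property(x):
    """Eq. 18: explicit derivative dE/dx at the optimized lam."""
    return -(optimize(x) + 0.3) + 0.2 * x

def second_order_property(x):
    """Eq. 20 for the response, then Eq. 19 for the property."""
    lam = optimize(x)
    dF_dx = -1.0                           # d2E/(dx dlam) for this model
    response = -dF_dx / hessian(x, lam)    # Eq. 20, here a 1x1 linear system
    return 0.2 + dF_dx * response          # Eq. 19; 0.2 is the explicit d2E/dx2

# Compare with central finite differences of the optimized energy.
x0, h = 0.7, 1e-3
fd1 = (optimized_energy(x0 + h) - optimized_energy(x0 - h)) / (2 * h)
fd2 = (optimized_energy(x0 + h) - 2 * optimized_energy(x0)
       + optimized_energy(x0 - h)) / h**2
print(first_order_property(x0), fd1)
print(second_order_property(x0), fd2)
```

In this sketch the first-order property never requires the response, whereas the second-order property requires it exactly once, in line with the discussion above.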
The response equations are usually solved in some iterative manner, in which the explicit construction of $\mathcal{G}$ is avoided, being replaced by the repeated construction of matrix-vector products of the form $\mathcal{G}\mathbf{v}$, where $\mathbf{v}$ is some "trial vector". In general, the solution of one set of response equations is considerably cheaper than the optimization of the wave function itself. Moreover, since the properties considered in this chapter involve at most three independent perturbations (corresponding to the three Cartesian components of the external field), the solution of the full set of equations needed for the evaluation of the molecular dipole-polarizability and magnetizability tensors is about as expensive as the calculation of the wave function in the first place.
We have established that, for a fully variational wave function, we may calculate the first-order properties from the zero-order response of the wave function (i.e., from the unperturbed wave function) and the second-order properties from the first-order response of the wave function. In general, the $2n + 1$ rule is obeyed: For fully variational wave functions, the derivatives (i.e., responses) of the wave function to order $n$ determine the derivatives of the energy to order $2n + 1$. This means, for instance, that we may calculate the energy to third order with a knowledge of the wave function to first order, but that the calculation of the energy to fourth order requires a knowledge of the wave-function response to second order.
Since many of the wave-function models in quantum chemistry are not fully variational, it would seem that the theory presented here is of limited practical interest. Consider the nonvariational energy functional $\mathcal{E}_{\mathrm{nv}}(x;\lambda)$. The optimized electronic energy is calculated from the expression
$$E(x) = \mathcal{E}_{\mathrm{nv}}(x;\lambda^*) \tag{22}$$
where the optimized parameters $\lambda^*$ are obtained by solving the linear or nonlinear set of equations
$$\mathbf{e}(x;\lambda^*) = 0 \tag{23}$$
In the special case of a variational wave function, these equations correspond to the variational conditions Eq. 9. For nonvariational wave functions, however, the equations are different and the variational conditions are not satisfied:
$$\left.\frac{\partial\mathcal{E}_{\mathrm{nv}}(x;\lambda)}{\partial\lambda}\right|_{\lambda=\lambda^*} \neq 0 \tag{24}$$
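In calculating first-order molecular properties, we can no longer use the simple Hellmann-Feynman expression Eq. 11 since
$$\frac{dE}{dx} \neq \left.\frac{\partial\mathcal{E}_{\mathrm{nv}}}{\partial x}\right|_{\lambda=\lambda^*} \tag{25}$$
Instead, it would appear that we must fall back on the more complicated expression Eq. 10
$$\frac{dE}{dx} = \left.\frac{\partial\mathcal{E}_{\mathrm{nv}}}{\partial x}\right|_{\lambda=\lambda^*} + \left.\frac{\partial\mathcal{E}_{\mathrm{nv}}}{\partial\lambda}\right|_{\lambda=\lambda^*}\left.\frac{\partial\lambda}{\partial x}\right|_{\lambda=\lambda^*} \tag{26}$$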
which involves the explicit evaluation of the response $\partial\lambda/\partial x$ by the solution of one set of linear equations Eq. 17 for each perturbation. The calculation of first-order properties according to Eq. 26 is clearly an expensive undertaking.
The key to solving this problem is to calculate the molecular properties not from the original expression but from a different, variational energy functional, whose optimized energy coincides with the nonvariational energy Eq. 22. If such a functional can be found, then the evaluation of the first-order molecular property from this functional may be carried out in exactly the same manner as for variational wave functions. Obviously, for this strategy to be useful, the construction of the new energy functional must be inexpensive so that what we gain from its use is not lost in its construction. Fortunately, a systematic and inexpensive procedure (namely, Lagrange's method of undetermined multipliers) exists for the construction of variational energy functionals, making this strategy worthwhile in most cases.
Let us set up a Lagrangian for the optimization of the energy $\mathcal{E}_{\mathrm{nv}}(x;\lambda)$ subject to the nonvariational constraints Eq. 23. Introducing one multiplier for each constraint, we obtain
$$L(x;\lambda,\bar{\lambda}) = \mathcal{E}_{\mathrm{nv}}(x;\lambda) + \bar{\lambda}^T\mathbf{e}(x;\lambda) \tag{27}$$
We now require this Lagrangian to be fully variational with respect to all parameters $\lambda$ and $\bar{\lambda}$ by solving the equations
$$\frac{\partial L(x;\lambda,\bar{\lambda})}{\partial\bar{\lambda}} = \mathbf{e}(x;\lambda) = 0 \tag{28}$$
$$\frac{\partial L(x;\lambda,\bar{\lambda})}{\partial\lambda} = \frac{\partial\mathcal{E}_{\mathrm{nv}}(x;\lambda)}{\partial\lambda} + \bar{\lambda}^T\frac{\partial\mathbf{e}(x;\lambda)}{\partial\lambda} = 0 \tag{29}$$
Whereas the first set of equations merely represents the nonvariational conditions Eq. 23 and is trivially satisfied, the second set determines the Lagrange multipliers $\bar{\lambda}$ and requires the solution of a single set of linear equations. We note that, at $\lambda = \lambda^*$ and $\bar{\lambda} = \bar{\lambda}^*$, the Lagrangian returns exactly the same energy as does the original energy function
$$E(x) = \mathcal{E}_{\mathrm{nv}}(x;\lambda^*) = L(x;\lambda^*,\bar{\lambda}^*) \tag{30}$$
but it has the additional useful property of being variational. The molecular properties may now be calculated from this Lagrangian as for any fully variational energy [1]. In particular, the first-order properties are obtained as
$$\frac{dE(x)}{dx} = \frac{dL(x)}{dx} = \frac{\partial L(x)}{\partial x} = \frac{\partial\mathcal{E}_{\mathrm{nv}}(x;\lambda)}{\partial x} + \bar{\lambda}^T\frac{\partial\mathbf{e}(x;\lambda)}{\partial x} \tag{31}$$
in accordance with the Hellmann-Feynman theorem. Note that the construction of the Lagrangian Eq. 27 from the nonvariational energy function $\mathcal{E}_{\mathrm{nv}}$ requires the solution of one set of linear equations Eq. 29, but that no further solution of linear equations is required for the calculation of first-order molecular properties according to Eq. 31, no matter how many properties are considered. In contrast, if no Lagrangian had been set up, we would instead have had to solve one set of linear equations Eq. 17 for each independent perturbation in Eq. 26. The results derived for variational wave functions in the present section are therefore quite generally applicable since, by a suitable (and usually trivial) modification of the energy functional, the energy functional of any computational method of electronic-structure theory may be made variational.
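As a rough illustration of Eqs. 27 to 31, the sketch below uses a hypothetical one-parameter nonvariational model: the parameter is fixed by a cubic constraint equation, not by minimizing the energy. The functional forms and all function names are assumptions made for the example. The property obtained from the Lagrangian is compared with a finite-difference derivative and with the naive Hellmann-Feynman expression, which fails here as stated in Eq. 25.

```python
# Hypothetical nonvariational model with one parameter lam.
def constraint(x, lam):
    """e(x; lam) = 0 determines lam (Eq. 23)."""
    return lam**3 + lam - x - 1.0

def energy_nv(x, lam):
    """Nonvariational energy function; d(energy_nv)/dlam is not zero at lam*."""
    return x * lam**2 + lam

def solve_constraint(x, lam=0.0):
    """Newton iterations on the constraint equation."""
    for _ in range(100):
        step = constraint(x, lam) / (3.0 * lam**2 + 1.0)
        lam -= step
        if abs(step) < 1e-14:
            break
    return lam

def energy(x):
    """Eq. 22: nonvariational energy at the constrained parameter."""
    return energy_nv(x, solve_constraint(x))

def multiplier(x, lam):
    """Eq. 29 solved for the single Lagrange multiplier."""
    return -(2.0 * x * lam + 1.0) / (3.0 * lam**2 + 1.0)

def property_from_lagrangian(x):
    """Eq. 31: explicit derivatives only; de/dx = -1 for this constraint."""
    lam = solve_constraint(x)
    return lam**2 + multiplier(x, lam) * (-1.0)

def naive_hellmann_feynman(x):
    """Explicit derivative of energy_nv alone (not valid here, cf. Eq. 25)."""
    return solve_constraint(x) ** 2

x0, h = 0.5, 1e-4
fd = (energy(x0 + h) - energy(x0 - h)) / (2 * h)
print(property_from_lagrangian(x0), fd, naive_hellmann_feynman(x0))
```

The multiplier is computed once per wave function; any further perturbation would only change the explicit derivatives entering `property_from_lagrangian`.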
Molecular properties in the diagonal representation of the Hamiltonian
The expressions derived for the molecular properties in the previous section are of a rather general and perhaps somewhat abstract character. For a given variational wave function, the explicit expressions for the molecular properties are obtained by substituting in Eqs. 18 to 21 the detailed form of the energy functional $\mathcal{E}(x;\lambda)$; for a nonvariational wave function, we first express the energy as a variational Lagrangian and then proceed in the same manner. We shall not discuss the detailed expressions for the derivatives here, referring instead to special reviews [1]. Still, to illustrate the physical contents of Eq. 18 and Eq. 21, we shall now see how these expressions are related to those of standard time-independent perturbation theory.
A particularly simple realization of the variational method arises if we make a linear ansatz for the wave function, expanding the electronic state $|C\rangle$ in an $m$-dimensional set of orthonormal antisymmetric $N$-electron functions $|i\rangle$ (e.g., Slater determinants):
$$|C\rangle = \sum_{i=1}^{m} C_i\,|i\rangle \tag{32}$$
The energy functional for this state is written as an expectation value
$$\mathcal{E}(C) = \frac{\langle C|H|C\rangle}{\langle C|C\rangle} \tag{33}$$
which depends on the numerical parameters $C_i$. The electronic gradient and Hessian are given by
$$\mathcal{F}_i(C) = \frac{\partial\mathcal{E}(C)}{\partial C_i} \tag{34}$$
$$\mathcal{G}_{ij}(C) = \frac{\partial^2\mathcal{E}(C)}{\partial C_i\,\partial C_j} \tag{35}$$
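which, upon substitution of Eq. 33 followed by differentiation, lead to
$$\mathcal{F}_i(C) = 2\left[\langle i|H|C\rangle - \mathcal{E}(C)\,\langle i|C\rangle\right] \tag{36}$$
$$\mathcal{G}_{ij}(C) = 2\left[\langle i|H|j\rangle - \mathcal{E}(C)\,\langle i|j\rangle\right] - 2\left[\mathcal{F}_i(C)\,\langle j|C\rangle + \mathcal{F}_j(C)\,\langle i|C\rangle\right] \tag{37}$$
In these expressions, we have assumed that the state $|C\rangle$ is real-valued and normalized
$$\langle C|C\rangle = 1 \tag{38}$$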
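The gradient and Hessian expressions Eqs. 36 and 37 can be checked against finite differences of the energy functional Eq. 33. The sketch below does this for a small real symmetric matrix whose entries are arbitrary illustrative numbers; the helper names are assumptions of the example. It treats Eq. 33 as a quotient so that the energy is also defined at the displaced, unnormalized points needed for the finite differences.

```python
import math

# Hypothetical real symmetric Hamiltonian matrix in an orthonormal basis.
H = [[-1.0, 0.2, 0.1],
     [0.2, 0.5, 0.3],
     [0.1, 0.3, 1.5]]

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def matvec(A, v):
    return [dot(row, v) for row in A]

def energy(C):
    """Eq. 33 as a quotient, defined for unnormalized C as well."""
    return dot(C, matvec(H, C)) / dot(C, C)

def gradient(C):
    """Eq. 36; assumes C is normalized."""
    HC, E = matvec(H, C), energy(C)
    return [2.0 * (HC[i] - E * C[i]) for i in range(len(C))]

def hessian(C):
    """Eq. 37; assumes C is normalized."""
    E, F, n = energy(C), gradient(C), len(C)
    return [[2.0 * (H[i][j] - (E if i == j else 0.0))
             - 2.0 * (F[i] * C[j] + F[j] * C[i])
             for j in range(n)] for i in range(n)]

def displaced(C, i, a, j, b):
    """Copy of C with component i shifted by a and component j by b."""
    D = list(C)
    D[i] += a
    D[j] += b
    return D

raw = [0.8, 0.5, 0.3]
norm = math.sqrt(dot(raw, raw))
C = [c / norm for c in raw]          # normalized trial vector, Eq. 38

h = 1e-4
fd_gradient = [(energy(displaced(C, i, h, i, 0.0))
                - energy(displaced(C, i, -h, i, 0.0))) / (2 * h)
               for i in range(3)]
fd_hessian = [[(energy(displaced(C, i, h, j, h))
                - energy(displaced(C, i, h, j, -h))
                - energy(displaced(C, i, -h, j, h))
                + energy(displaced(C, i, -h, j, -h))) / (4 * h * h)
               for j in range(3)] for i in range(3)]

max_gradient_error = max(abs(a - b) for a, b in zip(gradient(C), fd_gradient))
max_hessian_error = max(abs(hessian(C)[i][j] - fd_hessian[i][j])
                        for i in range(3) for j in range(3))
print(max_gradient_error, max_hessian_error)
```

The trial vector is deliberately not an eigenvector, so the gradient-dependent terms of Eq. 37 contribute to the comparison.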
In the following, we shall first consider the optimization of $|C\rangle$ and then go on to consider the evaluation of first- and second-order properties from the optimized wave function. From the expression for the electronic gradient Eq. 36, we note that the conditions Eq. 9 for a variational wave function
$$\mathcal{F}_i(C) = 0 \tag{39}$$
are now equivalent to the requirement
$$\langle i|H|C\rangle = \mathcal{E}(C)\,\langle i|C\rangle \tag{40}$$
In matrix notation, these equations may be written as
$$\mathbf{H}\mathbf{C} = \mathcal{E}(\mathbf{C})\,\mathbf{C} \tag{41}$$
where the elements of the Hamiltonian matrix $\mathbf{H}$ are given by
$$H_{ij} = \langle i|H|j\rangle \tag{42}$$
The conditions Eq. 41 represent a standard $m$-dimensional eigenvalue problem. Since $\mathbf{H}$ is Hermitian, the eigenvalue equations have exactly $m$ solutions
$$|K\rangle = \sum_{i=1}^{m} C_{iK}\,|i\rangle \tag{43}$$
which are orthonormal
$$\langle K|L\rangle = \delta_{KL} \tag{44}$$
Moreover, the associated $m$ eigenvalues
$$E_K = \mathcal{E}(\mathbf{C}_K) \tag{45}$$
are real and may be ordered as
$$E_0 \leq E_1 \leq \cdots \leq E_{m-1} \tag{46}$$
with $E_K$ representing an upper bound to the energy of the $K$th electronic state.
To evaluate the first- and second-order molecular properties, we choose the diagonal representation of the Hamiltonian. In this representation, the electronic energy, the electronic gradient, and the electronic Hessian of the electronic ground state $|0\rangle$ may be written in the following manner
$$E_0 = \langle 0|H|0\rangle \tag{47}$$
$$\mathcal{F}_K = 2\left[\langle K|H|0\rangle - \langle 0|H|0\rangle\,\langle K|0\rangle\right] \tag{48}$$
$$\mathcal{G}_{KL} = 2\,(E_K - E_0)\,\delta_{KL} \tag{49}$$
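Since the electronic Hessian has been evaluated at a stationary point, it no longer contains any gradient-like terms (compare with Eq. 37). Using these expressions in Eq. 18 and Eq. 21, we obtain
$$\frac{dE}{dx} = \left\langle 0\left|\frac{\partial H}{\partial x}\right|0\right\rangle \tag{50}$$
$$\frac{d^2E}{dx^2} = \left\langle 0\left|\frac{\partial^2 H}{\partial x^2}\right|0\right\rangle - 2\sum_{K>0}\frac{\left\langle 0\left|\frac{\partial H}{\partial x}\right|K\right\rangle\left\langle K\left|\frac{\partial H}{\partial x}\right|0\right\rangle}{E_K - E_0} \tag{51}$$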
which we recognize as the standard expressions of time-independent perturbation theory. However, although conceptually simple and transparent, these expressions are not particularly useful for practical calculations of molecular properties since a complete diagonalization of the Hamiltonian matrix is required. In practice, therefore, the molecular properties are calculated in a different manner, which does not require the diagonalization of the Hamiltonian matrix.
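The perturbation-theory expressions can be tested on the smallest nontrivial example, a two-state model with a Hamiltonian that depends linearly on the perturbation strength. The numerical values below are hypothetical, and the closed-form lowest eigenvalue of a real symmetric 2x2 matrix stands in for the "exact" energy. Because the model Hamiltonian is linear in x, the expectation value of the second derivative of H vanishes and only the sum-over-states term survives in the second-order property.

```python
import math

# Hypothetical two-state model: H(x) = H0 + x * V in the eigenbasis of H0.
E0, E1 = -1.0, 0.5                  # unperturbed energies
V00, V01, V11 = 0.3, 0.4, -0.2      # matrix elements of dH/dx

def ground_state_energy(x):
    """Lowest eigenvalue of the real symmetric 2x2 matrix H0 + x*V."""
    a, d, b = E0 + x * V00, E1 + x * V11, x * V01
    return 0.5 * (a + d) - math.sqrt((0.5 * (a - d)) ** 2 + b ** 2)

# First-order property: expectation value of dH/dx in the ground state.
first_order = V00
# Second-order property: sum-over-states term only (d2H/dx2 = 0 here).
second_order = -2.0 * V01 ** 2 / (E1 - E0)

# Central finite differences of the exact ground-state energy at x = 0.
h = 1e-3
fd1 = (ground_state_energy(h) - ground_state_energy(-h)) / (2 * h)
fd2 = (ground_state_energy(h) - 2 * ground_state_energy(0.0)
       + ground_state_energy(-h)) / h ** 2
print(first_order, fd1)
print(second_order, fd2)
```

For this model the diagonalization is trivial; the point made in the text is that for realistic wave functions it is not, which is why the response equations are solved instead.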
4 Explicit expressions for electric and magnetic properties
Having considered the general expressions for first- and second-order molecular properties, we now restrict ourselves to properties associated with the application of static uniform external electric and magnetic fields. For such perturbations, the Hamiltonian operator may be written in the manner (in atomic units)
$$H(\mathbf{F},\mathbf{B}) = H_0 - \mathbf{F}\cdot\mathbf{d}_e - \mathbf{B}\cdot\mathbf{d}_m + \frac{1}{8}\sum_i\left[B^2 r_i^2 - (\mathbf{B}\cdot\mathbf{r}_i)^2\right] \tag{52}$$
where the summation is over all electrons and where we have introduced the electric and magnetic dipole operators
$$\mathbf{d}_e = -\sum_i\mathbf{r}_i + \sum_I Z_I\mathbf{R}_I \tag{53}$$
$$\mathbf{d}_m = -\frac{1}{2}\sum_i\mathbf{r}_i\times\mathbf{p}_i - \sum_i\mathbf{s}_i = -\frac{1}{2}\mathbf{L} - \mathbf{S} \tag{54}$$
In Eq. 53, the first summation is over all electrons and the second summation over all nuclei. The positions of the electrons are given by $\mathbf{r}_i$ and the charges and positions of the nuclei by $Z_I$ and $\mathbf{R}_I$, respectively. In Eq. 54, the $\mathbf{p}_i$ and $\mathbf{s}_i$ are the conjugate momenta and spins of the electrons. We have also introduced the operators
$$\mathbf{L} = \sum_i\mathbf{r}_i\times\mathbf{p}_i \tag{55}$$
$$\mathbf{S} = \sum_i\mathbf{s}_i \tag{56}$$
for the total orbital and spin angular momenta of the electrons. Note that, in the clamped-nuclei approximation, there are no nuclear contributions to the magnetic dipole operator.
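Having set up the Hamiltonian, we may calculate the first- and second-order properties in the eigenvector representation. For the permanent electric and magnetic dipole moments, we obtain
$$\boldsymbol{\mu} = \langle 0|\mathbf{d}_e|0\rangle = -\sum_i\langle 0|\mathbf{r}_i|0\rangle + \sum_I Z_I\mathbf{R}_I \tag{57}$$
$$\mathbf{m} = \langle 0|\mathbf{d}_m|0\rangle = -\left\langle 0\left|\tfrac{1}{2}\mathbf{L} + \mathbf{S}\right|0\right\rangle \tag{58}$$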
Whereas the permanent electric dipole moment vanishes for molecules belonging to certain point groups (e.g., for all molecules that possess a center of inversion), the permanent magnetic dipole moment vanishes for all closed-shell systems. To see how the vanishing of the magnetic dipole moment comes about, we first note that
$$\mathbf{S}\,|\mathrm{cs}\rangle = 0 \tag{59}$$
since the closed-shell state $|\mathrm{cs}\rangle$ is a singlet. Next, we note that for all real-valued electronic states $|\mathrm{real}\rangle$ (such as all closed-shell states), the expectation value of any imaginary Hermitian operator (such as the orbital angular-momentum operator) is identically equal to zero
$$\langle\mathrm{real}|\mathbf{L}|\mathrm{real}\rangle = 0 \tag{60}$$
and the angular momentum is said to be quenched.
Let us now consider the second-order molecular properties. The static electric dipole-polarizability tensor is given by the expression
$$\boldsymbol{\alpha} = 2\sum_{K>0}\frac{\langle 0|\mathbf{d}_e|K\rangle\langle K|\mathbf{d}_e^T|0\rangle}{E_K - E_0} \tag{61}$$
and is nonnegative for the electronic ground state. For the magnetizability tensor, we obtain
$$\boldsymbol{\xi} = -\frac{1}{4}\sum_i\left\langle 0\left|r_i^2\,\mathbf{I} - \mathbf{r}_i\mathbf{r}_i^T\right|0\right\rangle + 2\sum_{K>0}\frac{\langle 0|\mathbf{d}_m|K\rangle\langle K|\mathbf{d}_m^T|0\rangle}{E_K - E_0} \tag{62}$$
The first term is always negative and is referred to as the diamagnetic contribution or the Langevin term. The second, sum-over-states term is known as the paramagnetic contribution and is always positive or zero for ground states.
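The sign statements above follow directly from the energy denominators in the sum-over-states expressions. The sketch below evaluates a scalar, one-component analogue of Eq. 61 for a few-state model with made-up energies and transition moments (all numbers and names are hypothetical). It also applies the same formula with an excited state as the reference state, which is an assumption of this example, to show that the sign is no longer fixed once states lie below the reference state.

```python
def sos_polarizability(ref, energies, moments):
    """Scalar analogue of Eq. 61 for reference state `ref`:
    2 * sum over K != ref of moments[ref][K]**2 / (E_K - E_ref)."""
    return 2.0 * sum(
        moments[ref][k] ** 2 / (energies[k] - energies[ref])
        for k in range(len(energies)) if k != ref
    )

# Hypothetical state energies and symmetric transition-moment matrix (a.u.).
energies = [0.0, 0.30, 0.32, 0.60]
moments = [[0.0, 0.9, 0.0, 0.4],
           [0.9, 0.0, 1.2, 0.5],
           [0.0, 1.2, 0.0, 0.5],
           [0.4, 0.5, 0.5, 0.0]]

alpha_ground = sos_polarizability(0, energies, moments)
alpha_excited = sos_polarizability(2, energies, moments)
print(alpha_ground, alpha_excited)
```

For the ground state every denominator is positive, so each term is nonnegative. For the excited reference state the close-lying lower state enters with a small negative denominator and dominates the sum.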
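For closed-shell states, there is, according to Eq. 59, no spin contribution to the paramagnetic part of the magnetizability, the only contribution coming from the orbital motion of the electrons:
$$\boldsymbol{\xi} = -\frac{1}{4}\sum_i\left\langle\mathrm{cs}\left|r_i^2\,\mathbf{I} - \mathbf{r}_i\mathbf{r}_i^T\right|\mathrm{cs}\right\rangle + \frac{1}{2}\sum_{K>0}\frac{\langle\mathrm{cs}|\mathbf{L}|K\rangle\langle K|\mathbf{L}^T|\mathrm{cs}\rangle}{E_K - E_0} \tag{63}$$
Furthermore, for any spherically symmetric closed-shell system, we obtain
$$\mathbf{L}\,|\mathrm{sph\text{-}cs}\rangle = 0 \tag{64}$$
since the wave function is an eigenfunction of $L^2$ with zero angular momentum. For such systems, the paramagnetic contribution to the magnetizability vanishes completely and the isotropic magnetizability (one third of the trace of the magnetizability tensor) may be written in the following simple manner
$$\xi_{\mathrm{iso}} = -\frac{1}{6}\sum_i\left\langle\mathrm{sph\text{-}cs}\left|r_i^2\right|\mathrm{sph\text{-}cs}\right\rangle \tag{65}$$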
Thus, for noble-gas atoms, the magnetizability is always negative or diamagnetic. Furthermore, for most closed-shell molecular systems, the paramagnetic contribution to the magnetizability is somewhat smaller (in magnitude) than the diamagnetic contribution (assuming the use of the center of mass as the gauge origin), making the magnetizability diamagnetic (i.e., negative) for nearly all ground-state closed-shell systems. Known exceptions to this rule are BH, CH+, and SiH+ [2,3], whose weak paramagnetism is probably related to the near-degeneracy of the $\sigma$ and $\pi$ orbitals of the half-filled valence shell.
In electronically excited states, however, the relative magnitude of the diamagnetic and the paramagnetic contributions may change. Thus, theoretical calculations have demonstrated that the $B\,{}^1\Sigma_u^+$ state in the hydrogen molecule is paramagnetic [4,5]. In this state, the diamagnetic and paramagnetic contributions to the total magnetizability are both larger (in magnitude) than the corresponding contributions in the ground state; the paramagnetism of the $B\,{}^1\Sigma_u^+$ state arises since the increase is largest for the paramagnetic contribution.
The diamagnetic and paramagnetic contributions to the magnetizability are often significantly larger than the observed diamagnetism of a molecule. In C60, for instance, the diamagnetic contribution is -13073 ppm cgs and the paramagnetic contribution 12714 ppm cgs, leading to a residual overall diamagnetism of only -359 ppm cgs (at the Hartree-Fock level) [6]. Similar observations are made for less exotic molecules, although not to the same extent. For instance, at the Hartree-Fock level, the paramagnetic contribution to the magnetizability in norbornadiene has been calculated to be 368.7 ppm cgs, with a total magnetizability of -60.4 ppm cgs, in good agreement with data extracted from a molecular Zeeman study of this molecule [7].
Because of the nonnegative sum-over-states expression in Eq. 61, the polarizability of an electronic ground state is necessarily positive. It is, however, possible to imagine that, in an excited state, there may exist close-lying states with negative energy denominators and large contributions to the sum-over-states expression Eq. 61, leading to an overall negative polarizability. Such negative excited-state polarizabilities have been predicted theoretically [8] but not yet observed experimentally. In contrast, recent experimental results for H2CS indicate that, for this system, one of the components of the paramagnetic magnetizability contribution is negative [9]. Unlike the situation for polarizabilities, this observation is not supported by ab-initio calculations [10].
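The cancellation between the two contributions can be made explicit with the numbers quoted above. The short sketch below only repeats that arithmetic; the diamagnetic value for norbornadiene is not quoted in the text and is inferred here from the quoted total and paramagnetic values, and the function name is illustrative.

```python
def total_magnetizability(diamagnetic, paramagnetic):
    """Total magnetizability as the sum of both contributions (ppm cgs)."""
    return diamagnetic + paramagnetic

# C60, Hartree-Fock level, values as quoted in the text (ppm cgs).
c60_dia, c60_para = -13073.0, 12714.0
c60_total = total_magnetizability(c60_dia, c60_para)
c60_ratio = abs(c60_dia / c60_total)       # size of the cancellation

# Norbornadiene: diamagnetic part inferred from total and paramagnetic part.
nbd_total, nbd_para = -60.4, 368.7
nbd_dia = nbd_total - nbd_para
nbd_ratio = abs(nbd_dia / nbd_total)

print(c60_total, c60_ratio)
print(nbd_dia, nbd_ratio)
```

On these numbers the diamagnetic term of C60 is roughly 36 times larger in magnitude than the net magnetizability, compared with a factor of about 7 for norbornadiene, which is the sense in which the cancellation is less extreme for the smaller molecule.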
5 London orbitals
There is one important aspect of the evaluation of magnetic properties to which we have not yet alluded: In the presence of a magnetic field, the Hamiltonian operator is not uniquely determined since the gauge origin for the vector potential representing the magnetic field may be chosen in different ways. For an exact solution of the problem (i.e., for an exact calculation of the magnetic properties), the choice of vector potential will not affect the calculated properties. In approximate calculations, on the other hand, the choice of vector potentials may critically affect the quality of the calculated results. In the present section, these problems and their solution are discussed, beginning with a review of the basic theory required for a quantum-mechanical treatment of an electronic system in an external magnetic field. We then go on to consider the gauge-origin problem in approximate calculations of electronic systems, explaining how this problem is solved through the use of London atomic orbitals.
Consider a nonrelativistic electronic system in the presence of a static external magnetic induction $\mathbf{B}$. When such a system is treated quantum mechanically, the first step is to set up a vector potential $\mathbf{A}(\mathbf{r})$ that fulfils the following two requirements (in the Coulomb gauge):
$$\nabla\times\mathbf{A}(\mathbf{r}) = \mathbf{B} \tag{66}$$
$$\nabla\cdot\mathbf{A}(\mathbf{r}) = 0 \tag{67}$$
For a uniform external magnetic field, such a potential may be written as
$$\mathbf{A}_O(\mathbf{r}) = \frac{1}{2}\mathbf{B}\times(\mathbf{r} - \mathbf{O}) \tag{68}$$
which vanishes at the gauge origin $\mathbf{r} = \mathbf{O}$. Next, we proceed to construct the electronic Hamiltonian operator. For a spin-free one-electron system, we obtain
$$H(\mathbf{A}_O) = \frac{1}{2}\pi^2 + V(\mathbf{r}) \tag{69}$$
where the kinetic or mechanical momentum is given by
$$\boldsymbol{\pi} = -i\nabla + \mathbf{A}_O(\mathbf{r}) \tag{70}$$
and where $V(\mathbf{r})$ is the potential. Finally, we calculate the electronic energy, for example by minimizing the expectation value of the energy with respect to the form of the wave function
$$E(\mathbf{B}) = \langle\psi(\mathbf{A}_O)|H(\mathbf{A}_O)|\psi(\mathbf{A}_O)\rangle \tag{71}$$
assuming that $\psi(\mathbf{A}_O)$ is normalized. The electronic energy $E(\mathbf{B})$ depends on the magnetic field $\mathbf{B}$ and must be evaluated for each field strength separately. If, for example, we are interested in the molecular magnetizability rather than in the electronic energy, we may calculate this property from the energy according to Eq. 7, either by some analytical technique or by numerical differentiation.
Before continuing our discussion of gauge-origin dependence, we note that the substitution of Eq. 70 in the spin-free Hamiltonian Eq. 69 followed by expansion does not lead to the expression Eq. 52. To account for the missing Zeeman spin interaction, we must first replace the nonrelativistic spin-0 Hamiltonian Eq. 69 with a nonrelativistic spin-$\tfrac{1}{2}$ Hamiltonian, which for a one-electron system is given by
$$H(\mathbf{A}_O) = \frac{1}{2}(2\mathbf{s}\cdot\boldsymbol{\pi})^2 + V(\mathbf{r}) = \frac{1}{2}\pi^2 + \mathbf{s}\cdot\mathbf{B} + V(\mathbf{r}) \tag{72}$$
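The expansion referred to above takes only a few lines for a single electron. The following is a sketch in which $\mathbf{r}_O = \mathbf{r} - \mathbf{O}$ and $\mathbf{L}_O = \mathbf{r}_O\times\mathbf{p}$ are shorthand symbols introduced here for illustration, with $\mathbf{p} = -i\nabla$; it uses the Coulomb-gauge condition Eq. 67 to commute $\mathbf{p}$ and $\mathbf{A}_O$.

```latex
\begin{align*}
\tfrac{1}{2}\pi^2
  &= \tfrac{1}{2}\,(\mathbf{p} + \mathbf{A}_O)\cdot(\mathbf{p} + \mathbf{A}_O)
   = \tfrac{1}{2}p^2 + \tfrac{1}{2}\,(\mathbf{p}\cdot\mathbf{A}_O + \mathbf{A}_O\cdot\mathbf{p}) + \tfrac{1}{2}A_O^2 \\
  &= \tfrac{1}{2}p^2 + \mathbf{A}_O\cdot\mathbf{p} + \tfrac{1}{2}A_O^2
     \qquad (\text{since } \nabla\cdot\mathbf{A}_O = 0) \\
  &= \tfrac{1}{2}p^2 + \tfrac{1}{2}\,(\mathbf{B}\times\mathbf{r}_O)\cdot\mathbf{p}
     + \tfrac{1}{8}\,\lvert\mathbf{B}\times\mathbf{r}_O\rvert^2 \\
  &= \tfrac{1}{2}p^2 + \tfrac{1}{2}\,\mathbf{B}\cdot\mathbf{L}_O
     + \tfrac{1}{8}\left[B^2 r_O^2 - (\mathbf{B}\cdot\mathbf{r}_O)^2\right]
\end{align*}
```

With the gauge origin at the coordinate origin, the last line reproduces the orbital part of $-\mathbf{B}\cdot\mathbf{d}_m$ and the quadratic term of Eq. 52 for one electron; the remaining spin part of $-\mathbf{B}\cdot\mathbf{d}_m$ is supplied by the $\mathbf{s}\cdot\mathbf{B}$ term of Eq. 72.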
Since, for closed-shell states, spin interactions do not contribute to any of our properties, we shall here ignore the contributions from the spin degrees of freedom and use the simpler spin-0 Hamiltonian of Eq. 69.
Although the procedure for the introduction of magnetic perturbations outlined above is in principle straightforward, there are some subtleties related to the choice of gauge origin for the vector potential. We note that the vector potential and the Hamiltonian operator are not uniquely defined since we may choose the position of the gauge origin $\mathbf{O}$ freely and still satisfy the requirements Eqs. 66 and 67. In contrast, all observable properties of the system (the electronic energy and the magnetizability, for instance) should be independent of the choice of gauge origin. This gauge-origin independence can occur only if the wave function $\psi(\mathbf{A}_O)$ changes in a very specific manner as we change the position of the gauge origin.
Consider first a general gauge transformation of the vector potential. For any scalar function $f(\mathbf{r})$, the curl of the gradient vanishes identically:
$$\nabla\times\nabla f = 0 \tag{73}$$
Accordingly, we may write a general gauge transformation of the vector potential in the following manner
$$\mathbf{A}'(\mathbf{r}) = \mathbf{A}(\mathbf{r}) + \nabla f(\mathbf{r}) \tag{74}$$
where $f(\mathbf{r})$ is some scalar function. For such a gauge transformation, the exact wave function transforms as
$$\psi'(\mathbf{r}) = \exp\left[-i f(\mathbf{r})\right]\psi(\mathbf{r}) \tag{75}$$
and gauge invariance of the energy and other properties is maintained. For example, it is easily verified that the following identity holds
$$\langle\psi'|H'|\psi'\rangle = \langle\psi|H|\psi\rangle \tag{76}$$
where $H'$ is the gauge-transformed Hamiltonian, constructed from $\mathbf{A}'(\mathbf{r})$ rather than $\mathbf{A}(\mathbf{r})$. It should be realized that, except for a constant overall phase factor, the expression in Eq. 75 constitutes an exact relationship between two wave functions that have been separately and independently determined for two different choices of the vector potential $\mathbf{A}'(\mathbf{r})$ and $\mathbf{A}(\mathbf{r})$. The two wave functions describe the same physical state since for example
$$|\psi'(\mathbf{r})|^2 = |\psi(\mathbf{r})|^2 \tag{77}$$
but correspond to two different representations of the magnetic field.
Let us now consider the particular gauge transformation associated with a shift of the gauge origin from $\mathbf{O}$ to $\mathbf{O}'$:
$$\mathbf{A}_{O'}(\mathbf{r}) = \mathbf{A}_O(\mathbf{r}) + \nabla f(\mathbf{r}) \tag{78}$$
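For such a transformation, the scalar function is given as the simple triple product
$$f(\mathbf{r}) = \frac{1}{2}\mathbf{B}\times(\mathbf{O} - \mathbf{O}')\cdot\mathbf{r} \tag{79}$$
The gauge-transformed wave function may consequently be written in the manner
$$\psi_{O'}(\mathbf{r}) = \exp\left[-\frac{i}{2}\mathbf{B}\times(\mathbf{O} - \mathbf{O}')\cdot\mathbf{r}\right]\psi_O(\mathbf{r}) \tag{80}$$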
where $\psi_{O'}(\mathbf{r})$ and $\psi_O(\mathbf{r})$ are the wave functions associated with the gauge origins $\mathbf{O}'$ and $\mathbf{O}$, respectively.
Having considered gauge transformations of the exact wave function, let us now turn our attention to approximate electronic wave functions. The first thing to note is that, for an approximate wave function expanded in a finite-dimensional variational space, there is no guarantee that the electronic wave function will transform correctly upon a change of the gauge origin; for this to happen, the variational space would have to be sufficiently flexible to reproduce the gauge transformation in Eq. 80 exactly. Indeed, within a finite linear variational subspace, gauge-origin invariance can never be obtained exactly, only approximately for small displacements of the gauge origin. In such cases, therefore, our calculated energies and properties will depend on our choice of gauge origin.
The lack of gauge invariance in approximate calculations gives rise to several problems. First, for the calculation of electronic energies and properties to be reproducible, we must in each case report the position of the gauge origin used in the calculation. In a few cases, a natural choice can be made for the gauge origin (for example, at the nucleus in an atomic calculation, as discussed below) but mostly no such choice can be made, making the results of the calculation rather less definite than would otherwise be the case. Second, the quality of the calculated results may in many cases depend critically on the choice of gauge origin for the vector potential, making the reliable a priori prediction of molecular magnetic properties difficult. Nevertheless, we shall shortly see that, by a suitable modification of the orbitals in which the electronic wave function is expanded, it is possible to develop a computational scheme in which the magnetic properties are obtained unambiguously and in a well-defined manner. Let us begin by examining the notion of a natural gauge origin more closely.
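The gauge-origin shift of Eqs. 78 to 80 is easy to verify numerically. The sketch below, with arbitrary illustrative values for the field, the two gauge origins and the test point (all hypothetical, as are the function names), checks that the gradient of the triple product in Eq. 79 equals the difference between the two vector potentials of Eq. 68, and that the phase factor of Eq. 80 has unit modulus so that the density of Eq. 77 is unchanged.

```python
import cmath

def cross(a, b):
    return [a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0]]

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def sub(a, b):
    return [x - y for x, y in zip(a, b)]

def vector_potential(B, origin, r):
    """Eq. 68: A(r) = (1/2) B x (r - origin)."""
    return [0.5 * c for c in cross(B, sub(r, origin))]

def gauge_function(B, O, O_new, r):
    """Eq. 79: f(r) = (1/2) B x (O - O') . r"""
    return 0.5 * dot(cross(B, sub(O, O_new)), r)

def numerical_gradient(f, r, h=1e-5):
    """Central-difference gradient of a scalar function of position."""
    grad = []
    for k in range(3):
        rp, rm = list(r), list(r)
        rp[k] += h
        rm[k] -= h
        grad.append((f(rp) - f(rm)) / (2 * h))
    return grad

# Hypothetical field, gauge origins and test point (atomic units).
B = [0.02, -0.01, 0.05]
O, O_new = [0.0, 0.0, 0.0], [1.5, -0.7, 2.0]
r = [0.4, 1.1, -0.6]

grad_f = numerical_gradient(lambda p: gauge_function(B, O, O_new, p), r)
shift = sub(vector_potential(B, O_new, r), vector_potential(B, O, r))
max_error = max(abs(g - s) for g, s in zip(grad_f, shift))   # Eq. 78

phase = cmath.exp(-1j * gauge_function(B, O, O_new, r))      # factor in Eq. 80
print(max_error, abs(phase))
```

Because the phase depends on position, a finite basis of field-independent functions cannot in general reproduce it, which is the difficulty discussed in the preceding paragraphs.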
Consider a one-electron atomic system in a magnetic field $\mathbf{B}$ and assume that the unperturbed system is represented by the approximate wave function $\chi_{lm}$, which we may take to be a Slater orbital or some other approximate representation of the atomic state. The unperturbed wave function is centered on $\mathbf{N}$ (the atomic nucleus) and is assumed to be an eigenfunction of the (effective) Hamiltonian $H_0$ and of the operator for angular momentum along the $z$ direction $L_z^N$:
$$H_0\,\chi_{lm} = E_0\,\chi_{lm} \tag{81}$$
$$L_z^N\,\chi_{lm} = m_l\,\chi_{lm} \tag{82}$$
The superscript $N$ in $L_z^N$ indicates that the angular momentum is defined relative to $\mathbf{N}$. We now apply the magnetic field with the gauge origin at $\mathbf{N}$:
$$\mathbf{A}_N(\mathbf{r}) = \frac{1}{2}\mathbf{B}\times(\mathbf{r} - \mathbf{N}) \tag{83}$$
Constructing the perturbed Hamiltonian in the usual manner and carrying out some simple algebra, we find that the unperturbed wave function $\chi_{lm}$ is correct to first order in the magnetic field $B$:
$$\left(H_0 + \frac{1}{2}B L_z^N + O(B^2)\right)\chi_{lm} = \left(E_0 + \frac{1}{2}m_l B\right)\chi_{lm} + O(B^2) \tag{84}$$
On the other hand, if we apply the field with a different gauge origin $\mathbf{M}$
$$\mathbf{A}_M(\mathbf{r}) = \frac{1}{2}\mathbf{B}\times(\mathbf{r} - \mathbf{M}) \tag{85}$$
then the unperturbed wave function is correct only to zero order in the field
$$\left(H_0 + \frac{1}{2}B L_z^M + O(B^2)\right)\chi_{lm} = E_0\,\chi_{lm} + O(B) \tag{86}$$
since $\chi_{lm}$ is not an eigenfunction of $L_z^M$. Thus, we see that our approximate wave function is biased towards $\mathbf{N}$ in the sense that, with this choice of gauge origin, the wave function is correct to first order in the field, whereas, with any other choice of gauge origin, it is correct only to zero order. Clearly, for one-electron atomic systems, a natural gauge origin exists.
Once a natural gauge origin has been identified, we may use this gauge origin as a reference gauge origin and enforce gauge-origin independence on our description by attaching to the wave function the phase factor associated with a shift of the gauge origin from the reference gauge origin $\mathbf{N}$ to the gauge origin $\mathbf{O}$ of the external vector potential. We now obtain a wave function of the form
$$\omega_{lm} = \exp\left[-\frac{i}{2}\mathbf{B}\times(\mathbf{N} - \mathbf{O})\cdot\mathbf{r}\right]\chi_{lm} \tag{87}$$
which is correct to first order in the field $\mathbf{B}$ for any choice of gauge origin $\mathbf{O}$, as is easily verified. Moreover, it may readily be shown that the expectation value of the one-electron Hamiltonian
$$E(\mathbf{B}) = \langle\omega_{lm}(\mathbf{A}_O)|H(\mathbf{A}_O)|\omega_{lm}(\mathbf{A}_O)\rangle \tag{88}$$
is independent of the gauge origin $\mathbf{O}$, always returning the energy that would be obtained with the gauge origin at $\mathbf{N}$. In a sense, we have arrived at an approximate but gauge-origin independent description of the atomic system. It should be understood, however, that our approximate description is gauge-origin independent by design rather than because of some inherent flexibility in the description. Nevertheless, since our choice of reference gauge origin was made because it provides a superior description of the electronic system in the field, we would expect our resulting wave function Eq. 87 to do a good job of representing the electronic system in a magnetic field.
Up to this point, we have considered only one-electron systems. For wave functions constructed as linear combinations of Slater determinants (i.e., all wave functions except those that explicitly contain the interelectronic distances in the parametrization), we can generalize our procedure to many-electron atomic systems by attaching one complex phase factor to each individual orbital according to Eq. 87. As in the one-electron case, the resulting description of the electronic system becomes independent of the position of the external gauge origin with the atomic center as the reference gauge origin. Again the quality of the description is expected to be good since, with this reference origin, the unperturbed wave function is correct to first order in the magnetic field.
Since, for an atomic system, we may always place the external gauge origin at the nuclear center, the introduction of the complex phase factor in the atomic orbital according to Eq. 87 may appear to be an academic exercise of no practical value. The importance of the phase factors in the orbitals becomes apparent only when several atoms are considered simultaneously.
In such cases, we cannot simultaneously place the external gauge origin $\mathbf{O}$ at all the atomic nuclei in the system and the introduction of the complex phase factors therefore becomes essential to ensure a uniform description of the electronic system. Indeed, without the complex phase factors attached, our description of two isolated, noninteracting atomic systems would unphysically depend on their relative separation (with the gauge origin chosen somewhere between the two nuclei). In the parlance of electronic-structure theory, we would say that our description is not "size-extensive".
We have seen how the introduction of complex phase factors ensures a uniform description of a supersystem containing several noninteracting atoms. Let us now consider the evaluation of magnetic properties for molecular systems, that is, for systems of interacting atoms. For such systems, we may (in analogy with our treatment of atomic orbitals) attach the complex phase factors to the molecular orbitals (MOs). This approach leads to the method known as individual gauge for localized orbitals (IGLO), which has been quite successful for calculations of nuclear shieldings and magnetizabilities [11,12]. One complication with this approach is that the MOs are in general not spherically symmetric and there may often be no natural reference gauge origin for the description of the magnetic perturbation: For the (almost) spherically symmetric, localized core orbitals, we may choose the reference gauge origin of the MOs to coincide with the atomic nucleus; for the delocalized valence orbitals, there may be no preferred position for the reference gauge origin. To alleviate this problem, the MOs are usually localized prior to the calculation of the magnetic properties.
An alternative solution to the gauge-independence problem in molecular calculations is to attach the complex phase factors directly to the atomic basis functions or atomic orbitals (AOs) rather than to the MOs. Thus, each basis function (which in modern calculations usually corresponds to a Gaussian-type orbital, GTO) is equipped with a complex phase factor according to Eq. 87. A spherical-harmonic GTO centered on $\mathbf{N}$ with exponent $\alpha$ may then be written in the form (with $\mathbf{r}_N = \mathbf{r} - \mathbf{N}$)
$$\omega_{lm}(\mathbf{r}) = \exp\left[-\frac{i}{2}\mathbf{B}\times(\mathbf{N} - \mathbf{O})\cdot\mathbf{r}\right]S_{lm}(\mathbf{r}_N)\,\exp\left(-\alpha r_N^2\right) \tag{89}$$
where $S_{lm}(\mathbf{r})$ is a standard solid-harmonic function. The resulting atomic orbitals are known as London atomic orbitals or gauge-independent atomic orbitals (GIAOs). London orbitals were introduced by Fritz London, who in 1937 used Hückel theory to calculate the contribution to the magnetizability from the ring currents in the π-orbital backbone of some aromatic molecules [13]. The great virtue of London's approach is that each individual AO (the building block of molecular wave functions) has been "harnessed" to respond correctly (to first order at least) to the application of an external magnetic field, irrespective of the choice of the external gauge origin. Moreover, since only the atomic orbitals are modified in London's approach, the method is fully transparent to the treatment of the electronic structure otherwise.

To illustrate the remarkable improvements in basis-set convergence that are obtained through the use of London orbitals, we have in Figure 1 plotted the isotropic magnetizability for the phosphorus trifluoride molecule (PF3) calculated with and without the use of London orbitals [14]. In these calculations, we have used the augmented correlation-consistent basis sets of Dunning and Woon [15, 16], the largest of which (aug-cc-pV5Z) contains 512 basis functions. The figure is quite striking. At the double-zeta level, the magnetizability obtained without London orbitals (i.e., with field-independent orbitals) is in error by a factor of almost 2.5. Even in the large aug-cc-pV5Z basis, the magnetizability is off the Hartree-Fock limit by as much as 15%. In contrast, the magnetizabilities obtained using London orbitals are all within 3% of the result in the largest basis, even at the aug-cc-pVDZ level. These calculations clearly demonstrate the importance of using properly adapted (field-dependent) basis functions for the calculation of magnetic properties. Indeed, for large systems such as C60, the use of London orbitals is mandatory since otherwise the basis set needed for a reliable calculation of the magnetizability would become unmanageable.

Figure 1: The isotropic magnetizability (in ppm cgs) of PF3 calculated with and without the use of London orbitals (the latter labelled as field-independent orbitals), as a function of the basis set (aug-cc-pVXZ, X = D, T, Q, 5).
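For orientation, the London orbital described in this section is commonly written in the textbook form sketched below. This is not a reproduction of the chapter's own equation: the symbols $\mathbf{B}$ (external magnetic field), $\mathbf{O}$ (gauge origin), $\mathbf{R}_\mu$ (center of the field-free basis function $\chi_\mu$), and the use of atomic units are assumptions made here for illustration.

```latex
\omega_\mu(\mathbf{r};\mathbf{B})
  = \exp\!\left[-\tfrac{i}{2}\,\mathbf{B}\times(\mathbf{R}_\mu-\mathbf{O})\cdot\mathbf{r}\right]
    \chi_\mu(\mathbf{r})
```

In this form the phase factor effectively moves the gauge origin of each basis function to its own center, which is the sense in which each AO responds correctly to first order irrespective of the choice of $\mathbf{O}$.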
6
The calculation of molecular magnetizabilities: Comparison with experiment
The last systematic measurements carried out for the diamagnetic magnetizability of gaseous substances are those undertaken by Barter, Meisenheimer, and Stevenson in 1960. Almost forty years later, these measurements still represent the best experimental data for the isotropic diamagnetic magnetizability of gas-phase molecules.¹ With the recent extension of London's method to the calculation of magnetizabilities at the ab-initio level, it is interesting to see how these gas-phase measurements compare with modern theoretical calculations, especially in view of the difficulties that beset the experimental measurement of isotropic diamagnetic magnetizabilities: (1) the smallness of the effect, which implies that even slight traces of oxygen (a strongly paramagnetic substance) in the probe can severely limit the accuracy of the measurements; and (2) the sensitivity of the results to the quality of the calibration standard employed in the investigation.

¹ This statement applies only to the isotropic magnetizability; the magnetizability anisotropy can be very accurately determined from microwave measurements, see for example the review by Sutter and Flygare [17].

Using London atomic orbitals at the Hartree-Fock level, we have investigated the isotropic magnetizability for a variety of saturated and unsaturated hydrocarbons, including cyclic and aromatic systems. For the normal alkanes and cycloalkanes, we used the aug-cc-pVDZ basis of Dunning and Woon [15, 16]. For the smaller molecules, a slightly smaller set was used, based on the aug-cc-pVDZ basis but with the most diffuse s and d orbitals on carbon removed as described in Ref. [18]. For all molecules, the geometry was optimized in the same basis as used for the calculation of the magnetizability, allowing us to compare the experimental values with purely theoretical results, see Table 1.

Table 1: The isotropic magnetizability (in ppm cgs) of hydrocarbons calculated using ab-initio methods and (where available) from experimental gas-phase measurements. The theoretical results have been taken from Refs. [18, 19]; the experimental results have been taken from the work of Barter, Meisenheimer, and Stevenson [20].

Molecule                  Theory    Pascal's rule (theory)    Experiment
Methane                    -19.0      -23.2                   -17.4 ± 0.8 (-18.6)
Ethane                     -29.7      -30.2                   -26.8 ± 0.8 (-28.7)
Ethene                     -21.5      -23.2                   - 8.8 ± 0.8
Ethyne                     -23.3      -16.1
Propane                    -41.8      -41.8                   -38.6 ± 0.8 (-41.3)
Propene                    -33.2      -34.8                   -30.7 ± 0.8 (-32.8)
Cyclopropane               -42.4      -34.8                   -39.2 ± 0.8 (-41.9)
Cyclopropene               -29.0      -27.7
Allene                     -29.1      -27.7                   -25.3 ± 0.8 (-27.0)
Propyne                    -34.6      -27.7
n-Butane                   -53.7      -53.4                   - 0.3 ± 0.8
trans-2-Butene             -44.7      -46.4
cis-2-Butene               -45.0      -46.4
s-trans-1,3-Butadiene      -37.0      -39.3
Cyclobutane                -45.4      -46.4                   -40.0 ± 0.8 (-42.8)
Cyclobutene                -36.9      -39.3
Cyclobutadiene             -16.5      -32.2
n-Pentane                  -65.6      -65.0                   -61.5 ± 0.8 (-65.8)
Cyclopentane               -61.4      -58.0                   -56.2 ± 0.8 (-60.1)
Cyclopentene               -52.0      -50.9
Cyclopentadiene            -47.4      -43.8
n-Hexane                   -77.5      -76.6
Cyclohexane                -69.7      -69.5
1,3-Cyclohexadiene         -53.7      -55.4
1,4-Cyclohexadiene         -50.9      -55.4
n-Heptane                  -89.4      -88.2
Cycloheptane               -81.9      -81.1
n-Octane                  -101.3      -99.8
Cyclooctane                -93.4      -92.7
n-Nonane                  -113.2     -111.4
Cyclononane               -104.9     -104.3
n-Decane                  -125.1     -123.0
Cyclodecane               -116.7     -115.9

In view of the problems encountered when magnetizabilities are calculated without London orbitals [14], the agreement with experiment is quite remarkable. Still, most of the calculated numbers are about 7% too diamagnetic. This discrepancy was first ascribed to the neglect of electron correlation [18]. However, accurate correlated calculations on the magnetizability of the noble-gas atoms (using Eq. 65) then revealed that the calibration standard chosen by Barter et al. (the magnetizability of the argon atom) was in error and that, to correct for this error, the experimental results should be scaled by a factor of 1.07. In Table 1, the scaled experimental results are listed in parentheses (next to the unscaled experimental numbers); the scaling leads to an almost perfect agreement between experiment and ab-initio theory.

From a theoretical point of view, the relative insensitivity of the magnetizability to electron correlation, as documented in a number of investigations [21-24], is gratifying since it allows us to investigate larger molecules at the Hartree-Fock level with some degree of confidence. Again, we would like to emphasize that the relatively high degree of agreement with experiment obtained at the Hartree-Fock level is attained only by the use of explicitly gauge-transformed orbitals (GIAO or IGLO). Nevertheless, even at the GIAO (or IGLO) level, the Hartree-Fock method does not always provide sufficient accuracy to refute experimental observations. For this purpose, electron-correlation effects [21-24] as well as rovibrational effects [22,25] must be accounted for.
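The effect of the 1.07 calibration correction can be checked directly from the entries of Table 1. The following is a minimal sketch, not part of the original study: it uses only those rows whose experimental entry is fully legible, and the function and variable names are illustrative assumptions.

```python
# Experimental (unscaled) and GIAO Hartree-Fock values from Table 1, in ppm cgs.
# Only rows whose experimental entry is fully legible are included.
TABLE_1 = {
    "methane":      {"theory": -19.0, "experiment": -17.4},
    "ethane":       {"theory": -29.7, "experiment": -26.8},
    "propane":      {"theory": -41.8, "experiment": -38.6},
    "propene":      {"theory": -33.2, "experiment": -30.7},
    "cyclopropane": {"theory": -42.4, "experiment": -39.2},
    "cyclobutane":  {"theory": -45.4, "experiment": -40.0},
    "n-pentane":    {"theory": -65.6, "experiment": -61.5},
    "cyclopentane": {"theory": -61.4, "experiment": -56.2},
}

SCALE = 1.07  # correction for the argon calibration standard, as quoted in the text


def percent_deviation(theory, experiment):
    """Signed deviation of theory from experiment, in percent of experiment.

    Both values are negative here, so a positive result means that theory
    is larger in magnitude (more diamagnetic) than experiment.
    """
    return 100.0 * (theory - experiment) / experiment


def compare(table, scale=SCALE):
    """Return {molecule: (deviation_unscaled, deviation_scaled)} in percent."""
    result = {}
    for name, row in table.items():
        raw = percent_deviation(row["theory"], row["experiment"])
        scaled = percent_deviation(row["theory"], scale * row["experiment"])
        result[name] = (raw, scaled)
    return result


if __name__ == "__main__":
    for name, (raw, scaled) in compare(TABLE_1).items():
        print(f"{name:13s} unscaled: {raw:+5.1f}%   scaled: {scaled:+5.1f}%")
```

For every molecule in this subset the deviation shrinks in magnitude once the experimental number is scaled, which is the improvement the text describes.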
7
Pascal's rule and σ-ring currents
The first experimental observation of the additivity of the molecular magnetizability was made by Henrichsen in 1888 [26]. Henrichsen showed that, for a large number of compounds, the addition of a single methylene unit changes the magnetizability of the compound by about -11.1 ppm cgs, independently of the functional groups otherwise present in the system.² Later, this additivity was explored by Pascal and Pacault [28,29]. Pascal developed his additivity scheme in terms of average atomic additivity constants (Pascal's rule); several researchers have since explored this concept and developed more refined systems, introducing functional-unit magnetizabilities as well as additivity schemes for the individual components of the magnetizability [30, 31]. Owing to its simplicity and surprising accuracy, the additivity scheme is widely used for estimating isotropic magnetizabilities. In NMR experiments, for example, it is often important to have a knowledge of the bulk magnetizability to estimate local magnetic-field effects on the observed shieldings; in microwave Zeeman spectroscopy, the additivity scheme is routinely used to extract the full set of magnetizability components (since only the anisotropies can be obtained from the experiment). Furthermore, one of the most successful schemes for determining the aromatic character of molecular systems is based on the predictive power of the magnetizability additivity schemes [32].

² Henrichsen measured the magnetizabilities relative to water. The magnitude of his methylene magnetizability therefore depends on the value chosen for the magnetizability of water. If we use the measurement on ice [27], the methylene magnetizability is -11.1 ppm cgs; if we use the experimental gas-phase value [27], it becomes -11.9 ppm cgs; and if we use the theoretical gas-phase value [24], it becomes -12.9 ppm cgs.

Theory can here play an important role. Experimental magnetizabilities are subject to intermolecular interactions as well as to the effects of molecular tumbling and conformational averages, effects that must be carefully accounted for when attempts are made at reproducing (or predicting) experimental numbers. On the other hand, like experimental noise, such effects may also interfere with our study of additivity as such, obscuring and blurring its finer details and patterns. The same effects do not enter ab-initio theory, however, making it a useful complementary tool for investigating and exploring additivity schemes. In this context, we note that the theoretical study of additivity schemes such as Pascal's rule requires a computational procedure that can be applied equally well to small and large systems. This requirement is fulfilled only by the GIAO and IGLO methods: In any method based on the use of a global gauge origin, the description of the individual subunits deteriorates as the system grows. Combined with its rapid basis-set convergence, this "size-extensivity" of GIAO Hartree-Fock theory makes it an excellent testing ground for additivity schemes, free from environmental effects and experimental noise.

Fitting the calculated data in Table 1 to a two-parameter model (ignoring the smallest, highly strained hydrocarbon rings containing three and four carbon atoms), we obtain -4.53 and -3.53 ppm cgs for the carbon and hydrogen atomic magnetizabilities, respectively.³ In Table 1, we have also listed the magnetizabilities predicted from this two-parameter model. The two parameters do a fairly good job of reproducing the ab-initio results, although some large deviations do appear. In particular, we note that a two-parameter model is unable to distinguish between the various isomers of molecules of the same atomic constitution.
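The two-parameter model amounts to a weighted atom count. The sketch below evaluates it with the fitted constants quoted above and sets the result beside a few ab-initio values from Table 1; the function name and the selection of molecules are illustrative assumptions.

```python
# Atomic magnetizabilities from the two-parameter fit quoted in the text (ppm cgs).
XI_CARBON = -4.53
XI_HYDROGEN = -3.53


def pascal_magnetizability(n_carbon, n_hydrogen):
    """Isotropic magnetizability of a hydrocarbon from the additivity model."""
    return n_carbon * XI_CARBON + n_hydrogen * XI_HYDROGEN


# name: ((number of C atoms, number of H atoms), GIAO Hartree-Fock value from Table 1)
MOLECULES = {
    "propene":      ((3, 6), -33.2),
    "cyclopropane": ((3, 6), -42.4),
    "n-hexane":     ((6, 14), -77.5),
    "cyclohexane":  ((6, 12), -69.7),
    "n-decane":     ((10, 22), -125.1),
}

if __name__ == "__main__":
    for name, ((n_c, n_h), ab_initio) in MOLECULES.items():
        model = pascal_magnetizability(n_c, n_h)
        print(f"{name:13s} model: {model:7.1f}   ab initio: {ab_initio:7.1f}")
```

Because the model sees only the atom counts, any two isomers (here the two C3H6 entries) necessarily receive the same predicted value, whatever their ab-initio magnetizabilities.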
As an example, the propene and cyclopropane molecules contain the same atoms (C3H6) but have very different magnetizabilities: -33.2 and -42.4 ppm cgs, respectively (at the GIAO Hartree-Fock level). Considering the very different bonding situations in these two systems, such a large difference in the magnetizability is perhaps not too surprising. In some respects, our sample of molecules is not sufficiently representative for building a set of additivity parameters, having a strong bias towards small molecules, where we would expect special bonding situations (double bonds, for instance) to dominate the magnetizability to a greater extent than in larger molecules.

³ These values differ slightly from those previously reported by us [18], mainly because of the larger number of molecules included in the present fit.

If instead we restrict ourselves to the study of a simpler set of hydrocarbons such as the normal alkanes, we would expect a two-parameter model to do an even better job at representing the calculated magnetizabilities. Indeed, restricting ourselves to the n-alkanes in Table 1 and considering the change in the magnetizability as we go from one n-alkane to the next, we find that the magnetizability associated with the added methylene group (the differential methylene magnetizability) differs by less than 2% for all molecules, as illustrated in Figure 2. In this study, we have ignored the change in the magnetizability from methane to ethane since this modification changes the number of terminal carbon atoms. If we ignore the ethane-to-propane shift as well, the variations become less than 0.5%, demonstrating the almost perfect additivity of the magnetizability for unstrained, saturated systems. In similar fits based on experimental data (for methane to hexane in the gas phase and for hexane to decane in the liquid phase as reported in Table 1), this near-perfect additivity of the magnetizability is blurred by molecular motion and experimental noise.

Figure 2: Differential methylene magnetizability for homologous alkanes (in ppm cgs), plotted against the number of carbon atoms in the longer chain.

An interesting phenomenon is observed when the magnetizability components are plotted as in Figure 3, where the notation for the Cartesian components is such that the molecular axis coincides with the x axis and the carbon chain is located in the xy plane. The figure shows that the main contribution to the (weak) oscillations of the isotropic magnetizability comes from the out-of-plane $\xi_{zz}$ component and that the average of the two in-plane components, $\frac{1}{2}(\xi_{xx} + \xi_{yy})$, converges more smoothly. The dashed line represents an idealized methylene magnetizability, corresponding to the converged differential methylene magnetizability of the n-alkanes. The methylene magnetizability has here been taken as the difference in the magnetizabilities of decane and nonane. According to Figure 2, this value should be close to the converged methylene magnetizability.

Figure 3: Differential methylene magnetizability components for homologous alkanes (in ppm cgs), plotted against the number of carbon atoms in the longer chain. Note that the scale of the abscissa differs from that of Figure 2.

Turning our attention to the cycloalkanes, we would naively expect (on the basis of the additivity rules) that the magnetizability of these compounds should be expressed as multiples of the methylene magnetizability extracted from the n-alkanes. However, in experimental gas-phase measurements, this was not observed to be the case for the rings from cyclopropane to cyclohexane [20]. Instead, the average methylene magnetizability oscillates strongly from cyclopropane to cyclohexane. Moreover, the oscillations do not occur about the average methylene magnetizability (as predicted from measurements on the n-alkanes) but about a somewhat less diamagnetic value. In this context, two questions arise: What is the cause of these oscillations, and why does the average methylene magnetizability not converge to the value predicted from the n-alkanes? Further experimental studies (on larger rings) are difficult to perform because of the low volatility of these compounds, and it thus appears difficult to answer these questions through experiment.

In Figure 4, we have plotted the calculated average methylene magnetizability in the cycloalkanes, with the dashed line representing the methylene magnetizability of the n-alkanes. From cyclopropane to cyclohexane, we observe the same oscillatory pattern as in experiment; beyond cyclohexane, the curve flattens out and only weak oscillations persist. Just as interestingly, the average methylene magnetizability does not converge to the value predicted from the n-alkanes.

Figure 4: Isotropic magnetizability per methylene unit in cycloalkanes (in ppm cgs), plotted against the number of carbon atoms in the ring.

As for the n-alkanes, useful information is obtained by comparing the different components of the magnetizability tensor. In the cycloalkanes, one of the principal directions of the magnetizability tensor is normal to the ring plane. In Figure 5, this component is denoted by $\xi_{\perp}$ and the average of the remaining in-plane components by $\xi_{\parallel}$. The plot shows that the oscillations in the isotropic magnetizability arise from $\xi_{\perp}$ alone and that $\xi_{\parallel}$ converges nicely to the methylene magnetizability of the n-alkanes. It thus appears that the anomalies of the magnetizability of the cycloalkanes are related to the presence of weak ring currents, induced by the component of the field perpendicular to the ring. As discussed by Kutzelnigg et al. in the context of aromatic systems, such σ-bond currents result from a balance between paramagnetic ring currents inside the ring and diamagnetic currents outside the ring [33]. This situation can be understood by picturing diamagnetic currents around the individual atoms, leading to currents on the inside and the outside of the ring.

Figure 5: Magnetizability components per methylene unit in cycloalkanes (in ppm cgs), plotted against the number of carbon atoms in the ring. Note that the scale of the abscissa differs from that of Figure 4.

Concerning the oscillations, we note that cyclohexane (where the oscillations disappear) is the smallest unstrained cycloalkane. Although we would have to go all the way to C14H28 to reach the next unstrained cycloalkane, the strain in the larger rings is significantly lower than in the smaller ones such as cyclopropane and cyclobutane. Apparently, the oscillations in the magnetizability component $\xi_{\perp}$ reflect the strain in the cycloalkanes.

The above study of saturated hydrocarbons illustrates how theoretical calculations may be used to extract information unattainable by experiment alone. From this study, we also conclude that caution should be exercised in applying Pascal's rule to highly strained ring systems even when no π-electrons are present. In the next section, we shall turn our attention to molecular systems containing the more loosely attached π-electrons.
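The trends discussed in this section can be recomputed from the isotropic ab-initio values of Table 1. The following sketch does so for the differential methylene magnetizability of the n-alkanes and for the magnetizability per methylene unit in the cycloalkanes. It is an illustration only: the function names are assumptions, and since Table 1 is rounded to one decimal, the fine oscillations visible in the figures are not resolved.

```python
# GIAO Hartree-Fock isotropic magnetizabilities from Table 1 (ppm cgs),
# indexed by the number of carbon atoms.
N_ALKANES = {2: -29.7, 3: -41.8, 4: -53.7, 5: -65.6, 6: -77.5,
             7: -89.4, 8: -101.3, 9: -113.2, 10: -125.1}
CYCLOALKANES = {3: -42.4, 4: -45.4, 5: -61.4, 6: -69.7,
                7: -81.9, 8: -93.4, 9: -104.9, 10: -116.7}


def differential_methylene(chain):
    """Change in magnetizability on adding one CH2 unit: xi(n) - xi(n - 1)."""
    return {n: chain[n] - chain[n - 1] for n in sorted(chain) if n - 1 in chain}


def per_methylene(rings):
    """Average magnetizability per CH2 unit in a ring (CH2)_n."""
    return {n: xi / n for n, xi in sorted(rings.items())}


if __name__ == "__main__":
    print("n-alkanes, differential methylene magnetizability:")
    for n, value in differential_methylene(N_ALKANES).items():
        print(f"  C{n - 1} -> C{n}: {value:6.2f}")
    print("cycloalkanes, magnetizability per methylene unit:")
    for n, value in per_methylene(CYCLOALKANES).items():
        print(f"  n = {n:2d}: {value:6.2f}")
```

With these inputs the chain increments stay close to -11.9 ppm cgs, whereas the ring values swing widely for the three- to five-membered rings and then settle near -11.7 ppm cgs, a less diamagnetic value than that of the chains.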
8
Aromatic molecules and π-bond currents

In the field of molecular magnetic properties, aromatic molecules have for a long time occupied a special place because of the elusive notion of ring currents in the π-orbital backbone, giving rise to a diamagnetic exaltation that cannot be accounted for in an atom-based additivity scheme. Indeed, Schleyer and Jiao recently suggested that the diamagnetic exaltation of the magnetizability is the only unique way of classifying the aromatic character of molecules [32], and much theoretical work has been devoted to finding theoretical evidence for such ring currents [34, 35].

Linus Pauling introduced a simple scheme for calculating the magnetizability of aromatic molecules. In Pauling's scheme [36], the magnetizability is calculated as the sum of an ordinary atom-based additivity term ($\xi_{\mathrm{add}}$) and a term that takes into account the effects of ring currents:

$$\xi_{\mathrm{ring}} = -38.0 \times k \sigma f \ \ \text{ppm cgs} \qquad (90)$$

Here σ represents the electron density per carbon-carbon bond, k is the strength of the induced magnetic dipoles in a given polyaromatic hydrocarbon relative to that in benzene, and f is a factor that corrects for the assumption that the electrons move in rectilinear segments, chosen by Pauling to be 1.23. With this approach, the magnetizability of almost any aromatic molecule may be calculated on the "back of an envelope" as $\xi = \xi_{\mathrm{add}} + \xi_{\mathrm{ring}}$.

Pauling's approach may be contrasted with the use of modern ab-initio methods, where the calculation on large polyaromatic hydrocarbons still remains a challenge. Indeed, the results reported in Table 2 have been obtained using massively parallel computers. The large computational cost notwithstanding, we have still not accounted for the effects of electron correlation, and our results thus represent only the Hartree-Fock limit for benzene, naphthalene, anthracene, naphthacene, pyrene, and coronene. Still, there is no reason to believe that these molecules should exhibit particularly large correlation contributions to the magnetizability, and we believe these results to represent rather accurate estimates (i.e., within 5%) of the gas-phase magnetizability of these polyaromatic hydrocarbons.

Table 2: The isotropic magnetizability (in ppm cgs) of some polyaromatic hydrocarbons as obtained with Pauling's model [36] and using ab-initio methods. The experimental results given by Pauling are also listed. The numbers in parentheses are obtained by scaling the experimental results by a factor of 1.1 (see Section 8). All calculations were made with idealized C-C bond distances of 1.42 Å and C-H bond distances of 1.07 Å.

Molecule       Chemical formula    Pauling    Experiment      This work
Benzene        C6H6                  -55      -55 (-61)         -60.2
Naphthalene    C10H8                 -96      -90 (-99)        -101.1
Anthracene     C14H10               -136     -124 (-136)       -144.2
Naphthacene    C18H12               -176                       -181.8
Pyrene         C16H10               -171     -155 (-171)       -164.7
Coronene       C24H12               -323                       -293.3

Considering the simplicity of Pauling's model, the agreement in Table 2 is striking: Our results agree with those of Pauling to better than 15%. However, the experimental results in Table 2 are obtained from liquid- or solid-phase measurements. As discussed elsewhere, the effects of intermolecular interactions lead to a paramagnetic shift of about 10% [7, 37]. In parentheses, we have therefore given the estimated experimental gas-phase results, obtained by scaling the experimental numbers by a factor of 1.1. Although this scaling leads to an improved agreement with our results, the agreement with Pauling's results deteriorates because of the implicit account of intermolecular effects in the additivity contribution $\xi_{\mathrm{add}}$.
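The two numerical statements made about Table 2 (agreement with Pauling's model to better than 15%, and gas-phase estimates obtained by scaling with 1.1) can be reproduced with a few lines. This is a sketch only: the data are transcribed from Table 2, and the helper names are illustrative assumptions.

```python
# Isotropic magnetizabilities from Table 2 (ppm cgs), given per molecule as
# (Pauling's model, condensed-phase experiment or None if not listed, ab initio).
TABLE_2 = {
    "benzene":     (-55, -55, -60.2),
    "naphthalene": (-96, -90, -101.1),
    "anthracene":  (-136, -124, -144.2),
    "naphthacene": (-176, None, -181.8),
    "pyrene":      (-171, -155, -164.7),
    "coronene":    (-323, None, -293.3),
}

GAS_PHASE_SCALE = 1.1  # estimated correction for intermolecular effects, from the text


def relative_difference(value, reference):
    """Unsigned difference between value and reference, in percent of reference."""
    return 100.0 * abs(value - reference) / abs(reference)


def pauling_vs_ab_initio(table):
    """Percent difference between Pauling's model and the ab-initio result."""
    return {name: relative_difference(pauling, ab_initio)
            for name, (pauling, _, ab_initio) in table.items()}


def scaled_experiment(table, scale=GAS_PHASE_SCALE):
    """Estimated gas-phase experimental values for molecules with a listed experiment."""
    return {name: scale * experiment
            for name, (_, experiment, _) in table.items()
            if experiment is not None}


if __name__ == "__main__":
    for name, diff in pauling_vs_ab_initio(TABLE_2).items():
        print(f"{name:12s} Pauling vs ab initio: {diff:4.1f}%")
    for name, value in scaled_experiment(TABLE_2).items():
        print(f"{name:12s} scaled experiment: {value:7.1f}")
```

The largest difference between Pauling's model and the ab-initio numbers in this table occurs for coronene, the largest system considered.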
9
The polarizability of normal- and cyclo-alkanes

The additivity of the molecular dipole polarizability was established already in the 1920s. However, whereas the magnetizability could easily be rationalized in terms of simple atomic magnetizabilities, more elaborate schemes were required for the polarizability. Thus, the first schemes employed bond polarizabilities [38, 39]; later, a group-polarizability scheme such as that outlined above for the methylene subunit was developed by Vogel [40]. For an overview of additivity schemes for molecular polarizabilities, we refer to a recent review by Bonin and Kadar-Kallen [41].

In Figure 6, we have plotted the differential methylene polarizability (the isotropic polarizability per methylene unit) of the n-alkanes. The most notable feature of this plot is the increase in the isotropic polarizability with the number of carbon atoms. From the decomposition of the polarizability along the principal directions in Figure 7 (with the Cartesian directions defined as for Figure 3), we note that the differential polarizability increases only in the x direction, that is, in the component parallel to the carbon chain. In the other two directions, the polarizability converges smoothly except for weak oscillations in the in-plane $\alpha_{yy}$ component, oppositely directed to those in the other in-plane component $\alpha_{xx}$. These oscillations, which are absent in the isotropic polarizability, may be linked to the alternation between the C2v and C2h point-group symmetries in the homologous alkane series.

Figure 6: Difference in isotropic polarizability between homologous alkanes, plotted against the number of carbon atoms in the longer chain. All polarizabilities reported in atomic units.

Figure 7: Difference in polarizability between homologous n-alkanes (components $\alpha_{xx}$, $\alpha_{yy}$, and $\alpha_{zz}$), plotted against the number of carbon atoms in the longer chain.

The behavior of the polarizability of the n-alkanes may be rationalized with reference to the simple particle-in-a-box model. In this model, the polarizability increases as the fourth power in the length of the box, which implies that the polarizability per unit length (i.e., the differential polarizability) increases as the third power in the length of the box. Obviously, in the n-alkanes, the electrons do not at all behave according to this simple model, but we may still attribute the increasing differential polarizability of the n-alkanes to a more pronounced delocalization of the electrons in the longer chains, as the electrons become more loosely attached to the system. As the chain grows, this effect becomes less important and the differential polarizability becomes constant.

If, in the n-alkanes, there is a small but noticeable increase in the differential polarizability with the chain length, we would expect this effect to be more pronounced in conjugated systems, where the delocalized π-electrons behave more like particles in a box. In Figure 8, we have, for the polyenes, plotted the differential isotropic polarizability and its component along the molecular axis as a function of the number of polyene units (C2H2)n as obtained in a recent study by Luo et al. [42]. (In the plot, the hump signals a switch from optimized geometries for the smaller polyenes to idealized polymer geometries for the larger polyenes.) Comparing with Figure 6, we note that the differential polarizability increases in a much more dramatic way in the polyenes. However, for this series also, the differential curve levels out and at some stage we would expect the curve to become flat. Indeed, such a saturation is well documented experimentally and theoretically [43,44].

Figure 8: Average polarizability per polyene unit (C2H2)n in atomic units (isotropic value and the component $\alpha_{xx}/n$ along the molecular axis), plotted against the number of polyene units, n.

We have seen that, in the unstrained cycloalkanes, the methylene magnetizability is somewhat less diamagnetic (i.e., numerically smaller) than predicted from the n-alkanes. A similar situation occurs for the polarizability. Thus, comparing the differential methylene polarizability of the n-alkanes in Figure 6 with the average methylene polarizability of the cycloalkanes in Figure 9, we note that, except for cyclopropane, the average methylene polarizability of the cycloalkanes is always smaller than that of the n-alkanes. Apparently, in the n-alkanes, the valence electrons are more delocalized and more loosely attached than in the cycloalkanes. We also note that, in the cycloalkanes, the differential polarizability decreases rather than increases with the number of carbon atoms. It turns out that this decrease in the differential polarizability may be attributed to the component perpendicular to the molecular plane, which decreases in magnitude as the number of carbon atoms increases and the ring becomes flatter, see Figure 10.

Figure 9: Average methylene polarizability for the cycloalkanes, plotted against the number of carbon atoms in the ring.
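The fourth-power law quoted for the particle-in-a-box model can be verified numerically from the standard sum-over-states expression for the static polarizability. The sketch below makes several assumptions that are not taken from the chapter: a single electron in its ground state, a one-dimensional box with infinite walls, atomic units, and a truncated sum over excited states.

```python
import math


def box_polarizability(length, n_states=200):
    """Ground-state static polarizability of one electron in a 1D box (atomic units).

    Sum-over-states formula: alpha = 2 * sum_n |<1|x|n>|^2 / (E_n - E_1),
    with E_n = n^2 pi^2 / (2 L^2) and, for even n,
    <1|x|n> = -8 L n / (pi^2 (n^2 - 1)^2). The matrix element vanishes
    for odd n > 1, so only even n are summed.
    """
    alpha = 0.0
    for n in range(2, n_states + 1, 2):
        dipole = -8.0 * length * n / (math.pi ** 2 * (n ** 2 - 1) ** 2)
        gap = (n ** 2 - 1) * math.pi ** 2 / (2.0 * length ** 2)
        alpha += 2.0 * dipole ** 2 / gap
    return alpha


def differential_polarizability(length, step=1e-3):
    """Central-difference estimate of d(alpha)/dL."""
    upper = box_polarizability(length + step)
    lower = box_polarizability(length - step)
    return (upper - lower) / (2.0 * step)


if __name__ == "__main__":
    for length in (5.0, 10.0, 20.0):
        alpha = box_polarizability(length)
        slope = differential_polarizability(length)
        print(f"L = {length:5.1f}  alpha = {alpha:12.4f}  d(alpha)/dL = {slope:10.4f}")
```

Doubling the box length multiplies the polarizability by 16 and its derivative with respect to the length by 8; the polarizability per unit length scales with the third power in the same way.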
10
The polarizability of polyaromatic hydrocarbons
In view of the discussion of magnetizabilities and polarizabilities we have presented for the alkanes in the previous sections, let us finally compare the polarizability and magnetizability of some polyaromatic hydrocarbons. In Table 3, the polarizability and magnetizability components are listed for a representative set of aromatic systems. Not surprisingly, all three components increase with the size of the system. For the polarizability, the increase is most pronounced along the x axis, that is, along the main molecular axis. In contrast, for the magnetizability, the increase is strongest for the component perpendicular to the molecular plane (the z axis). These observations are just what we would expect from the simple pictures based on the particle-in-a-box model (for the polarizability) and the notion of induced diamagnetic ring currents (for the magnetizability).

Figure 10: Methylene polarizability for the cycloalkanes, plotted against the number of carbon atoms in the ring.
Table 3: The magnetizability and polarizability of some polyaromatic hydrocarbons as obtained using high-level ab-initio methods. Polarizabilities reported in atomic units, and magnetizabilities in ppm cgs. The molecules are confined to the xy plane, with the x direction along the longest side of the molecule.

Molecule       α_xx     α_yy     α_zz     ξ_xx      ξ_yy      ξ_zz
Benzene         80.8     80.8     46.7     -7.92     -7.92    -22.14
Naphthalene    168.1    123.3     68.6    -12.6     -11.3     -39.9
Anthracene     284.3    170.9     90.0    -17.3     -14.6     -59.2
Naphthacene    425.8    225.6    111.3    -21.9     -18.1     -74.8
Pyrene         303.4    204.8     97.1    -18.4     -16.7     -68.8
Coronene       378.3    378.3    131.8   -118.0    -118.0    -644.0

11
Conclusions

In the present paper, we have discussed the ab-initio evaluation of the static polarizabilities and magnetizabilities of molecular systems, with emphasis on the principles underlying such calculations. With the recent widespread availability of powerful computers, these second-order molecular properties may nowadays be calculated a priori for large molecular systems, allowing us to explore, for instance, the relationship between the properties and molecular structure. Such calculations complement the experimental work in the area and may help in reassessing and improving on the empirical schemes developed since the late 19th century for the prediction of molecular polarizabilities and magnetizabilities from molecular structure. Among the pioneers and early practitioners of such schemes, Linus Pauling is the most prominent. In comparison with today's high-level nonempirical approach, these simple schemes are strikingly successful and constitute an integral part of Pauling's legacy to modern chemistry.
Acknowledgments

This work has received support from The Research Council of Norway (Programme for Supercomputing), NSC at Linköping University (Sweden), San Diego Supercomputer Center, and Cornell Theory Center through grants of computing time, and financial support from the National Science Foundation through Grant Nos. CHE-9320718 and CHE-9700627 to PRT.
References

[1] T. Helgaker and P. Jørgensen. Volume 19 of Adv. Quantum Chem., page 183. Academic Press Ltd., (1988).
[2] S. P. A. Sauer, T. Enevoldsen, and J. Oddershede. J. Chem. Phys., 98, 9748, (1993).
[3] K. Ruud, T. Helgaker, K. L. Bak, P. Jørgensen, and J. Olsen. Chem. Phys., 195, 157, (1995).
[4] J. Rychlewski and W. T. Raynes. Mol. Phys., 50, 1335, (1983).
[5] T. Helgaker, M. Jaszuński, and K. Ruud. Pol. J. Chem., accepted.
[6] K. Ruud, H. Ågren, T. Helgaker, P. Dahle, H. Koch, and P. R. Taylor. Chem. Phys. Lett., accepted.
[7] K. Voges, D. H. Sutter, K. Ruud, and T. Helgaker. Z. Naturforsch. A, submitted.
[8] D. Jonsson, P. Norman, and H. Ågren. Chem. Phys., 224, 201, (1997).
[9] T. Weber and W. Hüttner. Chem. Phys., 179, 487, (1994).
[10] M. Jaszuński, K. Ruud, and T. Helgaker. Unpublished results.
[11] W. Kutzelnigg. Isr. J. Chem., 19, 193, (1980).
[12] M. Schindler and W. Kutzelnigg. J. Chem. Phys., 76, 1919, (1982).
[13] F. London. J. Phys. Radium, 8, 397, (1937).
[14] K. Ruud and T. Helgaker. Chem. Phys. Lett., 264, 17, (1997).
[15] T. H. Dunning Jr. J. Chem. Phys., 90, 1007, (1989).
[16] D. E. Woon and T. H. Dunning Jr. J. Chem. Phys., 100, 2975, (1994).
[17] D. H. Sutter and W. H. Flygare. Top. Curr. Chem., 63, 91, (1976).
[18] K. Ruud, H. Skaane, T. Helgaker, K. L. Bak, and P. Jørgensen. J. Am. Chem. Soc., 116, 10135, (1994).
[19] P. Dahle. Master thesis, University of Oslo (1996).
[20] C. Barter, R. G. Meisenheimer, and D. P. Stevenson. J. Phys. Chem., 64, 1312, (1960).
[21] Ch. van Wüllen. PhD thesis, Ruhr-Universität Bochum, (1992).
[22] S. M. Cybulski and D. M. Bishop. J. Chem. Phys., 100, 2019, (1994).
[23] S. M. Cybulski and D. M. Bishop. J. Chem. Phys., 106, 4082, (1997).
[24] K. Ruud, T. Helgaker, and P. Jørgensen. J. Chem. Phys., 107, 10599, (1997).
[25] K. Ruud, P. O. Åstrand, T. Helgaker, and K. V. Mikkelsen. J. Mol. Struct. (THEOCHEM), 388, 231, (1996).
[26] S. Henrichsen. Wied. Ann., 34, 180, (1888).
[27] Handbook of Chemistry and Physics, 64th ed. (CRC, Boca Raton, 1984).
[28] P. Pascal. Ann. Chim. Phys., 19, 5, (1910).
[29] P. Pacault. Rev. Sci., 86, 38, (1948).
[30] W. Haberditzl. Angew. Chem. Int. Ed. Engl., 5, 288, (1966).
[31] W. Haberditzl. In L. N. Mulay and E. A. Boudreaux, editors, Theory and Applications of Molecular Diamagnetism, page 59. Wiley, New York, (1976).
[32] P. v. R. Schleyer and H. Jiao. Pure and Appl. Chem., 68, 209, (1996).
[33] W. Kutzelnigg, Ch. van Wüllen, U. Fleischer, R. Franke, and T. v. Mourik. In J. A. Tossell, editor, Nuclear Magnetic Shieldings and Molecular Structure. NATO ASI series, Plenum, (1993).
[34] U. Fleischer, W. Kutzelnigg, P. Lazzeretti, and V. Mühlenkamp. J. Am. Chem. Soc., 116, 5298, (1994).
[35] P. W. Fowler and E. Steiner. J. Phys. Chem. A, 101, 1409, (1997).
[36] L. Pauling. J. Chem. Phys., 4, 673, (1936).
[37] K. Ruud, D. H. Sutter, and T. Helgaker. Unpublished results.
[38] A. L. von Steiger. Berichte, 54, 1381, (1921).
[39] C. Smyth. Phil. Mag., 50, 361, (1925).
[40] A. Vogel. J. Chem. Soc., page 1833, (1948).
[41] K. D. Bonin and M. A. Kadar-Kallen. Int. J. Modern Phys. B, 8, 3313, (1994).
[42] Y. Luo, H. Ågren, H. Koch, P. Jørgensen, and T. Helgaker. Phys. Rev. B, 51, 14949, (1995).
[43] H. Thienpont, G. L. J. A. Rikken, E. W. Meijer, W. ten Hoeve, and H. Wynberg. Phys. Rev. Lett., 65, 214, (1980).
[44] J. L. Brédas, C. Adant, P. Tackx, and A. Persoons. Chem. Rev., 94, 243, (1994).
Z.B. Maksić and W.J. Orville-Thomas (Editors), Pauling's Legacy: Modern Modelling of the Chemical Bond, Theoretical and Computational Chemistry, Vol. 6. © 1999 Elsevier Science B.V. All rights reserved.
The Concept of Electronegativity of Atoms in Molecules
Juergen Hinze
Fakultät für Chemie, Universität Bielefeld, 33615 Bielefeld

1 Introduction
The enormous number of different chemical species with their multitude of properties can be classified into various categories, provided some ordering principles are discovered and used to associate the various species with different classes. It is with this in mind that as early as 1835 Berzelius[1] introduced a classification of chemical species according to a scale of electronegative and electropositive character. He wrote:

"Bei Beschreibung der einfachen Körper ist es eine sehr große Erleichterung für das Gedächtnis, sie, nach irgendeinem Principe, in mehrere Klassen einzuteilen. Die Gründe für eine solche Einteilung können mehrfach sein. Die Wahl beruht auf dem Endzwecke, den man beabsichtigt... Der Sauerstoff geht immer und ohne Ausnahme zu dem positiven, und Kalium immer und ohne Ausnahme zu dem negativen Pole (in der Elektrolyse einer Flüssigkeit). Dies gibt uns eine Art an, die Körper zwischen diesen beiden Extremen zu ordnen, je nachdem sie bei Zersetzung ihrer Verbindungen häufiger zu dem einen als zu dem anderen Pole gehen, und wodurch eine Reihe zwischen Sauerstoff und Kalium entsteht, in welcher die Körper der einen Hälfte elektronegative Körper..., und die andere Hälfte elektropositive... genannt werden können; und diese Reihe bildet, richtig aufgestellt, die Basis für die Chemie, als ein wissenschaftliches System von Tatsachen und deren Ursachen."

This translates literally into:

"In the description of the simple bodies, it is a great alleviation to the memory to divide them according to some principle into several classes. The foundations for such a classification can be varied. The choice rests on the aim desired... Oxygen moves always, without exception, to the positive, and potassium always, without exception, to the negative pole (in the electrolysis of a liquid). This gives us the means to order the bodies between those two extremes, dependent upon whether in the decomposition of their compounds they move more often to the one or to the other pole. Thus a series between oxygen and potassium emerges, in which the bodies of one half can be named electronegative bodies... and in the other half electropositive bodies. This series forms, correctly arranged, the basis for chemistry as a scientific system of facts and their causes."

With these sentences Berzelius had introduced the concept of electronegativity of the atoms (we understand today that with "simple bodies" atoms were meant) as an ordering principle in the description of chemistry. We understand also that he had actually described what is today the series of electrochemical potentials. As these ideas appeared to be applicable only to inorganic compounds, which dissociated in water, this type of classification was forgotten. It was more than 80 years later that Michael[2] recognised the significance of such an ordering principle in organic chemistry. He devised, based on Markownikow's rule, a relative electronegativity scale for the halogens. This was generalised by Kharasch[3,4], who suggested determining the relative electronegativity of two groups R and R' through the reaction
$$\mathrm{RHgR' + HCl \rightarrow RHgCl + HR'} ,$$

where, of the two arbitrary groups, R was to be the more electronegative.

2 Pauling's Definition of Electronegativity
It was left to the genius of Pauling to define the electronegativity of an atom in a molecule generally as "the power of an atom in a molecule to attract electrons to itself"[5]. He also had provided the means for the determination of an atomic electronegativity scale[6,7]. His suggestion was based on the postulate that the dissociation energy $D_{AB}$ of a single bond between atoms A and B may be obtained as the mean of the covalent dissociation energies $D_{AA}$ and $D_{BB}$ plus some extra ionic resonance energy $\Delta_{AB}$, i.e.

$$D_{AB} = (D_{AA} + D_{BB})/2 + \Delta_{AB} . \qquad (2.1)$$

Pauling recognised that $\sqrt{\Delta_{AB}}$ is an additive quantity in the sense that, for example,

$$\sqrt{\Delta_{AC}} = \sqrt{\Delta_{AB}} + \sqrt{\Delta_{BC}} .$$

This implies that $\Delta_{AB}$ corresponds to the square of the difference of atomic properties, i.e. the atomic electronegativity values

$$\Delta_{AB} \propto (\chi_A - \chi_B)^2 . \qquad (2.2)$$
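As an illustration of how eqs. (2.1) and (2.2) combine, the following minimal sketch computes the magnitude of an electronegativity difference from three dissociation energies. The function names, the parameter `k` (the proportionality constant of eq. (2.2)) and the numerical bond energies are assumptions made for this example only; they are not thermochemical data from the text.

```python
import math


def ionic_resonance_energy(d_ab, d_aa, d_bb):
    """Extra ionic resonance energy Delta_AB of eq. (2.1), arithmetic mean."""
    return d_ab - 0.5 * (d_aa + d_bb)


def pauling_difference(d_ab, d_aa, d_bb, k=1.0):
    """Return |chi_A - chi_B| from eq. (2.2), Delta_AB = k * (chi_A - chi_B)**2.

    k is the proportionality constant; k = 1 assumes energies in eV.
    """
    delta = ionic_resonance_energy(d_ab, d_aa, d_bb)
    if delta < 0:
        # A negative Delta_AB has no real square root, so eq. (2.2) cannot be used.
        raise ValueError("negative ionic resonance energy")
    return math.sqrt(delta / k)


# Hypothetical bond energies in eV, chosen only to exercise the formulas.
print(pauling_difference(d_ab=5.0, d_aa=4.0, d_bb=2.0))
```

Only the magnitude of the difference is obtained in this way; which of the two atoms is the more electronegative has to be decided on other grounds.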
This observation combined with eq. (2.1) allowed the generation of a relative electronegativity scale using thermochemical data. Originally Pauling had used [eV] for the energies and a proportionality constant of 1 in eq. (2.2)[6]. Later, using [kcal/mol] for the energies, the proportionality constant was chosen to be 23 [kcal/(mol eV)][5,6]; today, with the energies in [kJ/mol], the constant should be 100 [kJ/(mol eV)]. Having a method to determine relative electronegativities for the elements using molecular thermochemical data required the choice of an origin in order to obtain an absolute electronegativity scale. Pauling originally chose for the origin the value $\chi_H = 0$ for hydrogen. Later he changed the origin to $\chi_H = 2.1$, a value that had been obtained by Mulliken[8] (see below). With this new origin conveniently simple values were obtained for the electronegativities of the elements of the second row of the periodic system, i.e.

$$\chi_C = 2.5 , \quad \chi_N = 3.0 , \quad \chi_O = 3.5 \quad \text{and} \quad \chi_F = 4.0 ,$$
which helped the rapid general acceptance of Pauling's electronegativity values for the atoms. The determination of electronegativity values for the majority of the atoms, using eqs. (2.1) and (2.2), turned out to be difficult. Frequently the homonuclear dissociation energies $D_{AA}$ were not known and the $D_{AB}$ values had to be derived from atomisation energies, using the heats of formation of complex molecules. Several authors mastered these difficulties[9,10]. In particular the exhaustive treatment of Huggins[11] deserves credit. A more serious deficiency, pointing to the inadequacy of the postulate, eq. (2.1), was the fact that the extra ionic resonance energy of the alkali hydrides turned out to be negative. To overcome this difficulty, Pauling[12] suggested using in eq. (2.1) the geometric rather than the arithmetic mean of the homonuclear dissociation energies. This, however, was not used in a systematic way to determine electronegativity values for the atoms. Nevertheless, the concept of electronegativity as defined by Pauling rapidly became a mainstay of chemistry. A large number of chemical relations and reactions could be rationalised using the electronegativity concept. In many cases it was possible to correlate the atomic electronegativity values directly with measured atomic or molecular properties; this in fact led to a number of alternative suggestions for the determination of electronegativity values. These matters have been reviewed expertly in the past[13,14]. To discuss all the numerous suggestions for the definition and use of electronegativity in detail is next to impossible. In lieu of this, we quote for the reader a sentence from Pauling's introduction to the original edition in 1939 of "The Nature of the Chemical Bond"[5]. He wrote then:

"I have felt that in writing on this complex subject my primary duty should be to present the theory... (from my point of view) in as straightforward a way as possible... References are included to the early work in this field; the papers published in the last... years are so numerous, however, and often represent such small differences of opinion as to make the discussion of all of them unnecessary and even undesirable;"

This sentence is still appropriate, maybe even more so today than then. One reason why the concept of electronegativity was so rapidly accepted and deemed useful by the chemistry community is probably the general paradigm pursued by Pauling, i.e. to rationalise and derive as far as possible the enormous number of chemical observations, data, and molecular properties on the basis of the properties of the atoms, which are the constituents of the molecules. To be sure, Pauling was aware that the whole (the molecule) is more than the sum of its parts (the atoms), but he was also aware that atoms are changed only slightly when they become part of a molecule. As the derivation and rationalisation of molecular properties on the basis of atomic properties offers the possibility of an enormous data reduction and a better understanding of molecules and their behaviour, he pursued such an endeavour repeatedly. With his introduction of atomic electronegativities, which he knew should vary slightly depending on the atomic environment in the molecule, he had also introduced the atomic covalent binding energies, on the basis of which dissociation energies of heteropolar bonds, and thus enthalpies of reaction for molecules, could be estimated easily using eq. (2.1). He also had introduced the concept of covalent radii for the atoms, permitting the estimation of bond lengths in molecules simply as the sum of the covalent radii. In the effort to use atomic properties extensively to derive qualitatively and semiquantitatively the properties of complex molecules constructed from the atoms, it is useful to differentiate three types of such atomic properties.

1. Primary atomic properties are those which can be determined experimentally. These are nuclear charges and atomic masses, ionisation potentials and electron affinities, and the spectroscopic term values of the atoms and corresponding ions. Atomic polarizabilities and, in principle, the electronic charge distributions of the atoms would also belong to this class.

2. Secondary atomic properties are those which require, in addition to the experimentally determined quantities for the free atoms, theoretical concepts of the quantum mechanical characterisation of the electronic structure of the atoms. These are orbitals, the shell structure of atoms with emphasis on the valence shell, as well as concepts like hybridisation, the definition of the valence "state" and the valence "state" promotion energy in its relation to the spectroscopic term values of the free atoms.

3. Tertiary atomic properties are those which cannot be measured for the free atoms, but can be assigned to the individual atoms on the basis of the analysis of molecular properties or, in general, of the interaction of atoms in gases, solids and molecules. These are covalent, ionic and van der Waals radii for the atoms as well as the covalent dissociation energies of eq. (2.1). These properties require for their determination a theoretical concept and the analysis of a large database of molecular or solid state properties.

Clearly, the atomic electronegativities, as defined and determined following the suggestions of Pauling, would in this classification be characterised as tertiary atomic properties. On the other hand, primary and, as theoretical concepts will in general be required, secondary atomic properties are to be preferred when molecular properties are to be derived qualitatively or semiquantitatively on the basis of the properties of the constituent atoms.

3 Mulliken's Definition of Electronegativity
Only a few years after Pauling had introduced the concept of atomic electronegativity together with a procedure to determine relative electronegativity values for the atoms, using thermochemical data of molecules, Mulliken suggested an absolute electronegativity scale for the atoms[8]. Mulliken's definition of the absolute electronegativity of an atom is based on the consideration of a bond between the atoms A and B as a valence bond resonance hybrid with covalent and ionic structures, i.e.

$$A^-B^+ \leftrightarrow A{-}B \leftrightarrow A^+B^- . \qquad (3.1)$$

The energy required to go from the covalent structure $A{-}B$ to the ionic structure $A^+B^-$ is equal to the ionisation energy $IP_A$ of atom A minus the electron affinity $EA_B$ of atom B, while the energy required to arrive at the ionic structure $B^+A^-$ is $IP_B - EA_A$. The loss of the covalent binding energy and the gain of the Coulomb attraction in going to the ionic resonance structures is equal in both cases, thus the structure $B^+A^-$ will have more weight than the structure $A^+B^-$ if

$$IP_A - EA_B > IP_B - EA_A \qquad (3.2)$$

and thus atom A should be considered to be more electronegative than atom B. Adding $EA_A + EA_B$ to the left and right hand side of the inequality, eq. (3.2), leads to

$$IP_A + EA_A > IP_B + EA_B \qquad (3.3)$$

as a condition that atom A is more electronegative than atom B. Based on these considerations Mulliken defined the absolute electronegativity of an atom as

$$\chi = \frac{IP + EA}{2} , \qquad (3.4)$$
a quantity which seemingly could be determined solely from primary experimental data of the free atoms. However, Mulliken realised already in his original definition that this is not so, since the atoms as part of molecules considered in the VB resonance hybrid are not in their spectroscopic ground states, but should be regarded as being in valence states, ready for bonding. The theoretical concept of such valence states of the atoms, e.g. an $sp^3$ hybrid for carbon with four single bonds, was already developed, and it was known how to describe these valence states as superpositions of specific spectroscopic states of the atoms under consideration[15-18]. Thus the energy to arrive at such valence states, the valence state promotion energy, $VP$, could be calculated in principle using atomic spectroscopic data. With this, the ionisation energies and electron affinities to be used in the determination of electronegativities for the atoms, using the definition, eq. (3.4), had to be calculated using the corresponding ground state values, $IP^0$ and $EA^0$, corrected with the appropriate valence state promotion energies, i.e.

$$IP = IP^0 + VP^+ - VP^0 \quad \text{and} \quad EA = EA^0 + VP^0 - VP^- , \qquad (3.5)$$

to yield

$$\chi = \frac{IP^0 + EA^0 + VP^+ - VP^-}{2} \qquad (3.6)$$
as a definition of valence state electronegativity of an atom. With this it was possible in principle to determine atomic electronegativity values based entirely on primary and secondary atomic data, without reference to molecular data. Unfortunately, Mulliken's definition of an absolute (valence state) electronegativity scale for the atoms, even though it was well founded in the quantum mechanical characterisation of molecular bonding[19], did not receive the early wide acceptance in chemistry it deserved. Using eq. (3.6) for the determination of atomic electronegativity values was hampered in the early days by a lack of experimental data. The problems were similar to, or even more severe than, those with the original definition of Pauling, eqs. (2.1) and (2.2). Electron affinities were known only for a few elements, and the atomic spectroscopic term values needed for the calculation of the promotion energies were generally lacking. This led to a number of different suggestions for the determination of atomic electronegativity values. We will not discuss these here, as most have just historical value and they have been reviewed in the past[13,14]. One of these suggestions, however, deserves to be mentioned, and that is the empirical relation proposed by Allred and Rochow[20]

$$\chi = 0.359\, \frac{Z_{eff}}{r^2} + 0.744 , \qquad (3.7)$$

as it has been used to calculate electronegativity values for most of the atoms. In this simple relation, $Z_{eff}$ is the effective charge of the nucleus screened by the other electrons, estimated using Slater's rules[21], and $r$ is the covalent radius as tabulated by Pauling[22]. This formula was easy to use and the required atomic data were readily available for most atoms, even though $Z_{eff}$ is a quantity of questionable value. If the proportionality constant were dimensionless, the Allred-Rochow electronegativities would have the dimension of force/charge; however, there is no reason why the proportionality constant should be dimensionless. Systematic calculations of electronegativity values, following the definition of the absolute electronegativity scale as proposed by Mulliken in 1934, eq. (3.6), became possible in the late 1950s and early 1960s[13,23-25], aided by the extensive tabulation of the atomic term values[26], which were needed for the determination of the valence state promotion energies. As reliable electron affinities became available in 1975[27], an extensive redetermination of the earlier values became possible[28-30]. In this reevaluation, additional term values, which had recently been identified for some atoms, have been incorporated in the evaluation of the valence state promotion energies. Before presenting and discussing the results of these recent reevaluations of Mulliken electronegativity values, it is appropriate to consider some of the more modern theoretical developments concerning the concept of electronegativity.
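The bookkeeping of eqs. (3.5) and (3.6) can be sketched in a few lines. This is only an illustrative sketch: the function names are invented for the example, and the numerical values are hypothetical placeholders, not spectroscopic data.

```python
def valence_state_ip_ea(ip0, ea0, vp_plus, vp_zero, vp_minus):
    """Eq. (3.5): correct ground-state IP and EA with promotion energies.

    vp_plus, vp_zero, vp_minus are the valence state promotion energies of
    the positive ion, the neutral atom and the negative ion, respectively.
    """
    ip = ip0 + vp_plus - vp_zero
    ea = ea0 + vp_zero - vp_minus
    return ip, ea


def valence_state_electronegativity(ip0, ea0, vp_plus, vp_minus):
    """Eq. (3.6): the promotion energy of the neutral atom cancels."""
    return 0.5 * (ip0 + ea0 + vp_plus - vp_minus)


# Hypothetical values in eV, chosen only to show that eq. (3.4) applied to
# the corrected IP and EA reproduces eq. (3.6).
ip, ea = valence_state_ip_ea(11.0, 1.0, vp_plus=3.0, vp_zero=6.0, vp_minus=2.0)
print(0.5 * (ip + ea))
print(valence_state_electronegativity(11.0, 1.0, vp_plus=3.0, vp_minus=2.0))
```

The two printed numbers agree because $VP^0$ enters $IP$ and $EA$ with opposite signs and drops out of their sum.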

4 Orbital Electronegativity and Electrical Potential
The energy of an atom, $E(q)$, is a function of its charge $q$. This function, like any sufficiently smooth function, can be expressed as a Taylor series, i.e.

$$E(q) = E(0) + \frac{\partial E}{\partial q}\, q + \frac{1}{2} \frac{\partial^2 E}{\partial q^2}\, q^2 + \cdots , \qquad (4.1)$$

provided it is accepted that $q$ can be a continuous variable. Such power series expansions of the energy of an atom as a function of its charge had been used in the past to extrapolate electron affinity values[31,32] before it was possible to measure these values reliably and extensively by experiment. In analogy to this, Iczkowski and Margrave[33] wrote for the energy of an atom

$$E(N) = aN + bN^2 + cN^3 + dN^4 , \qquad (4.2)$$

with $N = n - n_0$ the number of electrons in excess or lacking relative to the neutral atom, for which $n_0 = Z$, the nuclear charge. The expansion coefficients $a$, $b$, $c$ and $d$ were determined by fitting the polynomial, eq. (4.2), to experimentally known ionisation energies of the atoms, and they suggested using for the electronegativity of the atoms the electrical potential

$$\chi(N) = \frac{\partial E(N)}{\partial N} = a + 2bN + 3cN^2 + 4dN^3 , \qquad (4.3)$$

which permitted the determination of the electronegativity not only for the neutral atoms, $N = 0$, but also for ions. The drawback of this definition was that it had been restricted to the atomic ground states and that it ignored the atomic shell structure, i.e. that large jumps in $E(N)$ are expected when atomic shells identified by their $(n, l)$ quantum numbers are transgressed. At about the same time, though published somewhat later, we were concerned with the calculation of valence state electronegativities following the original definition of Mulliken[23,24], and in turn we expressed the energy of an atom as a function of the charge $q$ in a valence orbital, i.e. a hybrid orbital ready for bonding, as

$$E(q) = a + bq + cq^2 \qquad (4.4)$$

with $0 \ge q \ge -2$. On the basis of this expression we had defined the orbital electronegativity as[24]

$$\chi(q) = \frac{\partial E(q)}{\partial q} = b + 2cq . \qquad (4.5)$$
Defining the ionisation energy for the orbital of the system (atom) as

$$IP = E(0) - E(-1) , \qquad (4.6)$$

and the corresponding electron affinity as

$$EA = E(-1) - E(-2) \qquad (4.7)$$

results in

$$b = \frac{3\,IP - EA}{2} \qquad (4.8)$$

and

$$c = \frac{IP - EA}{2} . \qquad (4.9)$$
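The step from eqs. (4.4), (4.6) and (4.7) to eqs. (4.8) and (4.9) is elementary. Written out as a check, using only the symbols already defined, it reads:

```latex
\begin{align*}
IP &= E(0) - E(-1) = a - (a - b + c) = b - c , \\
EA &= E(-1) - E(-2) = (a - b + c) - (a - 2b + 4c) = b - 3c , \\
IP - EA &= 2c \quad\Rightarrow\quad c = \frac{IP - EA}{2} , \\
b &= IP + c = \frac{3\,IP - EA}{2} .
\end{align*}
```

The constant $a$ cancels in both differences, so only $b$ and $c$ are fixed by $IP$ and $EA$.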
Thus, substituting $b$ and $c$ into eq. (4.5) yields for the normally singly occupied orbital ready for bonding the orbital electronegativity

$$\chi(-1) = \frac{IP + EA}{2} , \qquad (4.10)$$
which is identical to the original definition of Mulliken for the absolute electronegativity. With this the valence state of the atom under consideration is included appropriately. In fact, it becomes possible to have different electronegativities for an atom in a given valence state, i.e. for an $sp^2$ hybridised carbon with a $\pi$ orbital, one value for the $\sigma$ ($sp^2$ type) orbital and a different one for the $\pi$ orbital is obtained, which is reasonable and thus desirable. Using the definitions above, the electronegativity of a doubly occupied valence orbital is readily obtained as

$$\chi(-2) = \frac{3\,EA - IP}{2} , \qquad (4.11)$$

which, however, is not the electron withdrawal power as in the verbal definition of Pauling; rather it is the resistance to electron withdrawal. It should be kept in mind that for the evaluation of $\chi(-2)$ of a neutral atom the first and second ionisation energy and the corresponding valence state promotion energies are required. These are readily accessible from atomic spectroscopic data. By the same token we can obtain in principle the electronegativity of an empty valence orbital as

$$\chi(0) = \frac{3\,IP - EA}{2} , \qquad (4.12)$$
which might become of interest in the consideration of donor-acceptor bonds, i.e. for Lewis acids. However, if this value is desired for a neutral atom, then the $IP$ of such an orbital requires already the electron affinity of the atom, and the corresponding $EA$ value, leading to a doubly negative ion, will in general not be available. Within the concept of orbital electronegativity as outlined above, it is easy to identify the orbital "hardness" as

$$\frac{1}{2} \frac{\partial^2 E(q)}{\partial q^2} = \frac{1}{2} \frac{\partial \chi(q)}{\partial q} = c = \frac{IP - EA}{2} . \qquad (4.13)$$

This is akin to the hardness parameter $\eta$ as introduced by Pearson[34,35], thus $\eta = c$ in the notation we use here.
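Eqs. (4.5) and (4.8) to (4.13) can be collected in a short sketch that evaluates the electronegativity of an orbital for any occupation between empty and doubly occupied. The function names are assumptions made for this example; the hydrogen values $IP = 13.599$ and $EA = 0.755$ are those listed in Table 1.

```python
def orbital_parameters(ip, ea):
    """Return (b, c) of E(q) = a + b*q + c*q**2 from eqs. (4.8) and (4.9)."""
    b = 0.5 * (3.0 * ip - ea)
    c = 0.5 * (ip - ea)
    return b, c


def orbital_electronegativity(ip, ea, q):
    """chi(q) = b + 2*c*q, eq. (4.5), for an orbital charge 0 >= q >= -2."""
    if not -2.0 <= q <= 0.0:
        raise ValueError("q must lie between -2 and 0")
    b, c = orbital_parameters(ip, ea)
    return b + 2.0 * c * q


def orbital_hardness(ip, ea):
    """Orbital hardness of eq. (4.13), identical to the coefficient c."""
    return orbital_parameters(ip, ea)[1]


ip_h, ea_h = 13.599, 0.755  # hydrogen, values as listed in Table 1
for q in (0, -1, -2):  # empty, singly and doubly occupied orbital
    print(q, orbital_electronegativity(ip_h, ea_h, q))
print("hardness", orbital_hardness(ip_h, ea_h))
```

For $q = -1$ the function reduces to the Mulliken value $(IP + EA)/2$ of eq. (4.10); the values for $q = 0$ and $q = -2$ correspond to eqs. (4.12) and (4.11).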
The idea to define electronegativity as an electrical potential has been revived within the developments of density functional theory[36]. Here a seemingly rigorous derivation of the atomic or molecular chemical potential, identified with the negative of the electronegativity, has been obtained. This definition appeared appealing, as it seemed to be of great generality, not requiring the approximate construct of orbitals, and leading to a full equalisation of electronegativity throughout the system considered. However, a number of subtleties in this definition have not been spelled out explicitly in the original presentation, and that led to considerable confusion[37]. Density functional theory, based on the first theorem of Hohenberg and Kohn[38], states that there exists a general uniform energy functional for the ground state of any many electron system, depending only on the electron density $\rho(\mathbf r)$, unique for a given external potential $v(\mathbf r)$, here the electron-nuclear attraction. This functional may be written as

$$E[\rho] = \int v(\mathbf r)\rho(\mathbf r)\,d\mathbf r + F[\rho] , \qquad (4.14)$$

even though the detailed form of the functional $F[\rho]$ is not known (yet?). For a given number of electrons, $N$, a functional $N[\rho]$ defined such that

$$N = N[\rho] = \int \rho(\mathbf r)\,d\mathbf r \qquad (4.15)$$

is obtained, and it is not required that $N$ be a natural number. The second theorem of Hohenberg and Kohn states that for a given fixed number of electrons $N$, the energy functional $E[\rho]$ may be extremalised with respect to a variation of $\rho$, leading to the stationary principle

$$\delta \{ E[\rho] - \mu N[\rho] \} = 0 , \qquad (4.16)$$
where the restrictive condition, that the number of electrons be unchanged in the variation of $\rho$, has been introduced multiplied with the Lagrange multiplier $\mu$, such that the expression in eq. (4.16) may be varied freely, provided $\rho(\mathbf r)$ remains $N$- and $v$-representable. Based on this Parr[36] identifies $\mu$ with the chemical potential and obtains[39]

$$\mu = \left( \frac{\partial E}{\partial N} \right)_v = -\chi , \qquad (4.17)$$

where the electronegativity $\chi$ is identified with the negative of the chemical potential $\mu$. It is in the details of the partial derivative of $E$ with respect to $N$ where the danger of confusion lies, as $N$ depends on $\rho$, and many different changes in $\rho$ can lead to the same change in $N$. To clarify this, consider a change in the number of electrons from $N$ to $N + \delta N$; then we may find, solving eq. (4.16), $\rho_N(\mathbf r)$ and $\rho_{N+\delta N}(\mathbf r)$, and the change in $N$ will become

$$\delta N = N[\rho_{N+\delta N}] - N[\rho_N] = \int \rho_{N+\delta N}(\mathbf r)\,d\mathbf r - \int \rho_N(\mathbf r)\,d\mathbf r . \qquad (4.18)$$

If we write

$$\rho_{N+\delta N}(\mathbf r) = \rho_N(\mathbf r) + f(\mathbf r)\,\delta N , \qquad (4.19)$$

which may be done without loss of generality for sufficiently small $\delta N$, we obtain

$$\delta N = \delta N \int f(\mathbf r)\,d\mathbf r , \qquad (4.20)$$

with the obvious consequence that $f(\mathbf r)$ is normalised to one.
On the other hand, however, the change in $N$ to $N + \delta N$ may also come about due to a change in the normalisation of $\rho$, i.e. such that

$$\tilde\rho_{N+\delta N}(\mathbf r) = \frac{N + \delta N}{N}\, \rho_N(\mathbf r) , \qquad (4.21)$$

in which case we would have

$$\tilde f(\mathbf r) = \frac{\rho_N(\mathbf r)}{N} , \qquad (4.22)$$

but $\delta N$, let us call it here $\delta \tilde N$, as defined through eq. (4.18) with $\tilde\rho_{N+\delta N}(\mathbf r)$, would have quite a different meaning. Among the many different meanings one could associate with $\delta N$, by choosing different normalised functions $f(\mathbf r)$, it is the one of eq. (4.22) which, in its limit $\delta \tilde N \rightarrow 0$, is to be associated with the partial derivative of eq. (4.17), identified with the Lagrange multiplier $\mu$, which was introduced such as to maintain the normalisation of $\rho(\mathbf r)$, equal to $N$. Thus $\mu$, the chemical potential, is obtained as

$$\mu = \frac{\delta E}{\delta \tilde N} = \frac{E[\tilde\rho_{N+\delta N}] - E[\rho_N]}{N[\tilde\rho_{N+\delta N}] - N[\rho_N]} = \frac{E}{N} , \qquad (4.23)$$
the mean energy per electron, which should not be identified with the negative of the electronegativity. The same result had been obtained earlier[29], where a quite different derivation is presented in the appendix. The same conclusion can also be reached if $E[\rho]/N[\rho]$ is extremalised with respect to a variation of parameters in $\rho$. Such an extremalisation is equivalent to extremalising $E[\rho]$, eq. (4.16); however, it is not necessary to introduce a restrictive condition with a Lagrange multiplier. Thus we obtain

$$\delta \frac{E[\rho]}{N[\rho]} = 0 = \frac{N[\rho]\,\delta E[\rho] - E[\rho]\,\delta N[\rho]}{N^2[\rho]} . \qquad (4.24)$$

Setting the numerator equal to zero and rearranging leads to

$$\delta E[\rho] - \frac{E[\rho]}{N[\rho]}\, \delta N[\rho] = 0 , \qquad (4.25)$$

which is identical to eq. (4.16) with $\mu = E/N$. All this is in contradiction to the density functional definition of electronegativity as presented originally by Parr[36], where it is also inferred that the electronegativity for all the (natural) orbitals of the system should be the same. The discrepancy between these two different derivations and points of view[29,36] is still to be resolved. In addition, the global definition of electronegativity as promulgated within density functional theory does not appear to be particularly useful, as it yields the same electronegativity throughout a molecule, which runs counter to chemical experience with the electro- and nucleophilic centres observed and identified within a molecule. To be sure, whenever electronegativity is calculated using density functional theory, it is in general obtained for the highest occupied Kohn-Sham orbital[37], and the spatial dependence is recovered via the Fukui function $f(\mathbf r)$[37,40] as defined through eq. (4.19). We do not advocate the concept of electronegativity as derived within density functional theory for the following reasons:

• The theoretical discrepancies pointed out above would need to be resolved.

• An equal electronegativity throughout an entire system, even for systems which hardly interact, does not seem useful.

• To obtain these electronegativity values requires a calculation of the density function of the system, which can be done using density functional theory only for the ground state. After such an extensive calculation, required for the determination of the density functional electronegativity values, one has in general sufficiently detailed information about the system and its observables that the electronegativity, useful for qualitative considerations, does not seem to be needed anymore.
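For reference, the step from eq. (4.21) to eq. (4.22) used in the argument of this section can be written out explicitly. The following lines are only a restatement in the notation introduced above, obtained by comparing the rescaled density with the form of eq. (4.19):

```latex
\begin{align*}
\tilde\rho_{N+\delta N}(\mathbf r)
  &= \frac{N+\delta N}{N}\,\rho_N(\mathbf r)
   = \rho_N(\mathbf r) + \frac{\rho_N(\mathbf r)}{N}\,\delta N , \\
\tilde f(\mathbf r) &= \frac{\rho_N(\mathbf r)}{N} , \qquad
\int \tilde f(\mathbf r)\,d\mathbf r
   = \frac{1}{N}\int \rho_N(\mathbf r)\,d\mathbf r = 1 .
\end{align*}
```

The rescaled density thus satisfies the normalisation condition of eq. (4.20), like any other admissible choice of $f(\mathbf r)$.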

5 Orbital Electronegativity Values
Using the definitions of the orbital electronegativity presented in eqs. (3.5), (4.10) and (4.11) together with the latest experimental values for the ionisation potentials, electron affinities and term values for the atoms, an extensive reevaluation of the atomic valence state promotion energies and electronegativities has been carried out[28-30]. The results obtained are presented in Chart 1 and in more detail in Table 1. With these results for the electronegativity values calculated, obtained as a potential in [V], we give also the corresponding values in "Pauling Units" [PU], using the conversion

$$\chi_P[\mathrm{PU}] = 0.303\, \chi_M[\mathrm{V}] . \qquad (5.1)$$

In addition we give the corresponding "hardness parameters"[34] in units of [V/e] and a parameter $b^1$, which permits the determination of electronegativities as a function of the charge on an atom, to be described below. These parameters are given here already as they were obtained directly in the recalculation of the electronegativity values.
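The conversion of eq. (5.1) can be checked against a few entries of Table 1 with a short sketch. The function and variable names are assumptions made for this example; the $IP$, $EA$ and $\chi_P$ values are those listed in Table 1 for Li(s), Be(di) and C(te).

```python
PAULING_PER_VOLT = 0.303  # conversion factor of eq. (5.1)


def mulliken(ip, ea):
    """Mulliken electronegativity of a singly occupied orbital, eq. (4.10)."""
    return 0.5 * (ip + ea)


def to_pauling_units(chi_m):
    """Convert a Mulliken value in [V] to Pauling Units, eq. (5.1)."""
    return PAULING_PER_VOLT * chi_m


# (IP, EA, tabulated chi_P) as listed in Table 1
table1_rows = {
    "Li(s)": (5.392, 0.620, 0.91),
    "Be(di)": (8.552, 1.018, 1.45),
    "C(te)": (14.423, 1.719, 2.45),
}

for atom, (ip, ea, chi_p_tabulated) in table1_rows.items():
    chi_p = to_pauling_units(mulliken(ip, ea))
    print(atom, round(chi_p, 2), chi_p_tabulated)
```

The conversion is applied to the unrounded Mulliken value; applying it to the two-decimal $\chi_M$ entries of the table can shift the last digit.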

Chart 1: Electronegativity Values, χ, in Pauling Units and Hardness Parameters, η, in [V/e]. [Periodic-table chart; the layout of the individual entries is not recoverable here. The main-group values are those listed in Table 1.]

Table 1: Atomic Electronegativity Parameters. The parameters $b^0$ in [V], $b^1$ and $c$ in [V/e] are defined through eqs. (6.6) and (6.7). In the last two columns the Mulliken and Pauling electronegativity values, $\chi_M$ and $\chi_P$ respectively, are given. For the inert gases (marked *), the electronegativity values given are those for doubly occupied orbitals.

Atom        IP       EA       b^0      b^1      c = η    χ_M     χ_P
H           13.599    0.755   20.021    0.000    6.422    7.18    2.17
He*         54.402   24.578   69.314    0.000   14.912    9.67    2.93
Li(s)        5.392    0.620    7.778    0.000    2.386    3.01    0.91
Be(di)       8.552    1.018   12.319    6.307    3.767    4.79    1.45
B(tr)       11.250    1.130   16.310    8.823    5.060    6.19    1.88
C(te)       14.423    1.719   20.775   10.741    6.352    8.07    2.45
C(tr)       15.447    2.315   22.013   10.677    6.566    8.88    2.69
C(di)       17.277    3.677   24.077   10.596    6.800   10.48    3.17
C(π)?       11.036    0.400   16.354   11.026    5.318    5.72    1.73
N(15%s)     16.974    2.390   24.266   12.900    7.292    9.68    2.93
O(10%s)     20.166    3.644   28.427   14.792    8.261   11.91    3.61
F(5%s)      22.853    4.469   32.045    0.000    9.192   13.66    4.14
Ne*         44.274   21.558   55.632    0.000   11.358   10.20    3.09
Na(s)        5.144    0.560    7.436    0.000    2.292    2.85    0.86
Mg(di)       7.029    0.929   10.079    5.070    3.050    3.98    1.21
Al(tr)       8.860    1.806   12.387    6.118    3.527    5.33    1.62
Si(te)      11.384    2.580   15.786    7.225    4.402    6.98    2.12
P(15%s)     13.431    2.785   18.754    8.816    5.323    8.11    2.46
S(10%s)     14.049    3.377   19.385   10.524    5.336    8.71    2.64
Cl(5%s)     15.921    4.183   21.790    0.000    5.869   10.05    3.05
Ar*         29.357   15.755   36.158    0.000    6.801    8.95    2.71
K(s)         4.341    0.501    6.261    0.000    1.920    2.42    0.73
Ca(di)       5.747    0.997    8.122    4.108    2.375    3.37    1.02
Cu(s)        7.726    1.226   10.976    0.000    3.250    4.48    1.36
Zn(di)       8.254    1.407   11.678    5.781    3.424    4.83    1.46
Ga(tr)       9.762    1.926   13.680    6.474    3.918    5.84    1.77
Ge(te)      11.544    2.564   16.034    7.397    4.490    7.05    2.14
As(15%s)    12.350    2.534   17.258    9.572    4.908    7.44    2.25
Se(10%s)    12.831    3.401   17.546    9.833    4.715    8.12    2.46
Br(5%s)     14.653    4.057   19.951    0.000    5.298    9.36    2.83
Kr*         26.376   13.996   32.566    0.000    6.190    7.81    2.36
Rb(s)        4.176    0.484    6.022    0.000    1.846    2.33    0.71
Sr(di)       5.369    0.979    7.564    3.837    2.195    3.17    0.96
Ag(s)        7.576    1.303   10.713    0.000    3.137    4.44    1.35
Cd(di)       7.790    1.091   11.140    5.645    3.350    4.44    1.35
In(tr)       9.073    1.707   12.756    6.017    3.683    5.39    1.63
Sn(te)      10.346    3.680   13.679    5.986    3.333    7.01    2.12
Sb(15%s)    11.983    2.217   16.866    7.628    4.883    7.10    2.15
Te(10%s)    11.483    3.607   15.421    8.712    3.938    7.54    2.29
I(5%s)      13.197    3.741   17.925    0.000    4.728    8.47    2.57
Xe*         23.320   12.128   28.916    0.000    5.596    6.53    1.99

Some additional comments on the results presented in Chart 1 and Table 1 are in order. Detailed calculations have been possible only for those elements included in Table 1. In this table we report different values for different valence states only in the case of carbon. Corresponding results for a large number of valence states for different elements have been documented elsewhere[28-30]. Here we have chosen for the different elements just those results corresponding to the most probable valence states. To this end we report here the values for the elements of the main groups 5 through 7 with 15%, 10% and 5% s-character per bonding orbital, respectively. A somewhat more detailed discussion of these results has been published recently[41].
6 Electronegativity Equalisation and Charge Distribution
On the basis of atomic spectroscopic data we now have electronegativity values for atomic orbitals ready for bond formation. These values are in accord with the conventional atomic electronegativities, which were obtained in a much less direct way. To be sure, it was necessary to use, in addition to the atomic spectroscopic data, the concepts of orbitals and of the atomic valence state. The latter is somewhat problematic in its definition, in particular for atoms with an extended valence shell, such as five-binding phosphorus, or sulfur, which may be four- or six-binding. This is even more true for the transition metals. In those cases we may still use an average, global electronegativity value for the atom, as is necessary in all other definitions of electronegativity. On the other hand, the concept of orbital electronegativity may be extended readily to describe the charge transfer in bond formation, thus permitting the quantitative determination of the charge distribution in a molecule. The definition of the orbital electronegativity for atomic orbitals ready for bonding as an electric potential provides an obvious rationalisation for the concept of electronegativity equalisation between the two atomic orbitals used to form a localised molecular orbital in bond formation. Such an equalisation had been suggested by Sanderson[42-44]. If an orbital i on atom A joins with orbital j on atom B to form a bond, a transfer of charge $\Delta q$ will occur from the less electronegative orbital to the more electronegative one, thus lowering the energy of the system until the electronegativities of the bond forming orbitals are equalised. This leads to
$$\chi_{iA}(q_i^0 + \Delta q) = \chi_{jB}(q_j^0 - \Delta q), \tag{6.1}$$
with $q_i^0$ and $q_j^0$ the reference charges in the two orbitals before bond formation and $\Delta q$ the charge transferred from atom B to atom A. Using this together with eq.(4.5) permits the immediate determination of the charge transferred[23,24]
$$\Delta q = \frac{\chi_j(q_j^0) - \chi_i(q_i^0)}{2(c_j + c_i)}. \tag{6.2}$$
The energy lowering due to the charge transfer is obtained within this simple model[14,23] as
$$\Delta E = -\frac{\left[\chi_j(q_j^0) - \chi_i(q_i^0)\right]^2}{4(c_j + c_i)}, \tag{6.3}$$
which is reminiscent of the extra ionic resonance energy, compare eq.(2.2), used by Pauling in his original definition of electronegativity. That this admittedly oversimplified model yields reliable and useful results may be attributed to the approximate cancellation of two opposing effects not considered: (i) due to the charge transfer, the covalent bond is weakened, which corresponds to a decrease of the bond energy, while (ii) due to the charge transfer a Coulomb attraction is generated between the bonded centres, which corresponds to an increase in the bond energy. Some investigations of the concept of electronegativity equalisation[14,45] do suggest modifications to eqs.(6.2) and (6.3); however, improvements that could be implemented easily have not evolved as yet. The concept developed above, which uses electronegativity equalisation in a local two-electron bond to determine the charge transferred, is still too limited for atoms in molecules in general. An atom in a molecule may have several different ligands to which bonds are formed and charge is transferred. We therefore extend the concepts presented above and consider an atom with m orbitals, forming m localised two-electron bonds to its ligands. In each bond k, for k = 1 through m, the charge $\Delta q_k$ will be transferred. Thus the total net charge on the atom considered will become
$$Q = \sum_{k=1}^{m} \Delta q_k. \tag{6.4}$$
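The two-orbital equalisation of eqs.(6.1) through (6.3) can be illustrated with a minimal sketch. It assumes the linear form $\chi(q) = b + 2cq$, i.e. eq.(6.8) without the residual-charge term; the function names and the numerical parameters are hypothetical and are not taken from Table 1.

```python
def orbital_chi(b, c, q):
    """Linear orbital electronegativity chi(q) = b + 2*c*q (assumed form)."""
    return b + 2.0 * c * q


def charge_transfer(chi_i0, chi_j0, c_i, c_j):
    """Return (dq, dE): charge transferred, eq.(6.2), and energy lowering, eq.(6.3)."""
    diff = chi_j0 - chi_i0
    dq = diff / (2.0 * (c_i + c_j))
    d_e = -diff ** 2 / (4.0 * (c_i + c_j))
    return dq, d_e


# Hypothetical parameters b and c for two orbitals, reference charge q0 = 0.
b_i, c_i = 7.0, 5.0
b_j, c_j = 10.0, 6.0

dq, dE = charge_transfer(orbital_chi(b_i, c_i, 0.0),
                         orbital_chi(b_j, c_j, 0.0), c_i, c_j)

# After the transfer both orbitals have the same electronegativity, eq.(6.1).
chi_i_after = orbital_chi(b_i, c_i, +dq)
chi_j_after = orbital_chi(b_j, c_j, -dq)
print(dq, dE, chi_i_after, chi_j_after)
```

The final two values agree, which is the equalisation condition itself; the energy lowering is always negative or zero because it is quadratic in the electronegativity difference.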
With this, the electronegativity of an orbital i of this atom will depend not only on the charge $\Delta q_i$ transferred in the bond it is engaged in, but also on the residual charge
$$r_i = Q - \Delta q_i = \sum_{k \neq i} \Delta q_k, \tag{6.5}$$
due to the charge transferred in all the other bonds the atom considered has formed. With this we obtain an extension of the charge dependence of the orbital electronegativity as given by eq.(4.5) to
$$\chi_i(r_i, q_i) = b(r_i) + 2c\,q_i. \tag{6.6}$$
As the resulting residual charges are small in general, and since the parameter c = (IP-EA)/2 depends only weakly on the residual charge, c can be assumed to be constant, while for the parameter b(r) a linear approximation of the type
$$b(r) = b^0 + b^1 r \tag{6.7}$$
appears adequate. It is fortunate that $b^0$ as well as $b^1$ can be calculated using spectroscopic data determined for the free atoms and ions. The parameters obtained were already given in Table 1; the parameters for additional valence states of the atoms have been obtained and are tabulated elsewhere[30]. With the parameters $b^0$, $b^1$ and c at hand for the different atomic orbitals, we can write the orbital electronegativity in its dependence on the charge in the orbital, $q = q^0 + \Delta q$, and the residual charge r on the atom as
$$\chi_i(r_i, q_i) = b_i^0 + b_i^1 r_i + 2 c_i q_i. \tag{6.8}$$
For a molecule with N bonds, each formed between two atomic orbitals, say i and j, of two adjacent atoms, we have the bond index (ij) running from 1 through N. The requirement of electronegativity equalisation within each bond
$$\chi_i(r_i, q_i^0 + \Delta q_i) = \chi_j(r_j, q_j^0 - \Delta q_i) \tag{6.9}$$
yields N linear equations (for details see ref.[29,30]), which may be solved readily for the N values of $\Delta q_i$; note that $\Delta q_j = -\Delta q_i$. This provides an easy means to determine the partial charges in any molecule, using only the concept of orbital electronegativity, with the parameters $b^0$, $b^1$ and c determined for the orbitals from atomic spectroscopic data. As the concept of electronegativity equalisation used is restricted to localised two-electron bonds, two additional provisions are required: (i) if two atoms are joined by multiple (σ and π) bonds, these have to be treated together, otherwise an instability in the linear equations yields an unreasonable σ donation balanced by an almost equally large π back donation; (ii) in the case of conjugated systems, the corresponding dominant Lewis structures with localised bonds need to be treated separately and the results averaged appropriately. The final result obtained is fortunately not very sensitive to the weights used in this averaging.
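How the N linear equations arise from eqs.(6.8) and (6.9) may be seen in the following sketch. It is only an illustration under stated assumptions: all reference charges $q^0$ are set to zero, every pair of atoms is joined by a single bond (provisions (i) and (ii) above are not handled), the data layout and function names are hypothetical, and the orbital parameters are invented rather than taken from Table 1.

```python
def solve_linear(a, rhs):
    """Solve a x = rhs by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    m = [row[:] + [rhs[k]] for k, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for k in range(col, n + 1):
                m[r][k] -= f * m[col][k]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = sum(m[r][k] * x[k] for k in range(r + 1, n))
        x[r] = (m[r][n] - s) / m[r][r]
    return x


def equalise(bonds):
    """bonds[k] = ((atom_i, b0_i, b1_i, c_i), (atom_j, b0_j, b1_j, c_j)).

    The unknown x[k] is the charge shifted onto the first orbital of bond k;
    the second orbital receives -x[k].  Each row is eq.(6.9) written with
    eq.(6.8): 2(c_i + c_j) x_k + b1_i r_i - b1_j r_j = b0_j - b0_i.
    """
    n = len(bonds)
    a = [[0.0] * n for _ in range(n)]
    rhs = [0.0] * n
    for k, ((atom_i, b0_i, b1_i, c_i), (atom_j, b0_j, b1_j, c_j)) in enumerate(bonds):
        a[k][k] = 2.0 * (c_i + c_j)
        rhs[k] = b0_j - b0_i
        for l, (first, second) in enumerate(bonds):
            if l == k:
                continue
            # Bond l contributes +x[l] to its first atom and -x[l] to its second.
            for atom, sign in ((first[0], 1.0), (second[0], -1.0)):
                if atom == atom_i:
                    a[k][l] += b1_i * sign   # residual charge on atom_i
                if atom == atom_j:
                    a[k][l] -= b1_j * sign   # residual charge on atom_j
    return solve_linear(a, rhs)


def net_charges(bonds, x):
    """Sum the transferred charges per atom, eq.(6.4)."""
    q = {}
    for (first, second), dq in zip(bonds, x):
        q[first[0]] = q.get(first[0], 0.0) + dq
        q[second[0]] = q.get(second[0], 0.0) - dq
    return q


# Hypothetical triatomic Y1-X-Y2 with invented parameters (b0, b1, c).
x_orbital = ("X", 7.0, 5.0, 5.0)
bonds = [(x_orbital, ("Y1", 10.0, 4.0, 6.0)),
         (x_orbital, ("Y2", 10.0, 4.0, 6.0))]
dq_bonds = equalise(bonds)
charges = net_charges(bonds, dq_bonds)
print(dq_bonds, charges)
```

For a single bond the system reduces to eq.(6.2); with two equivalent ligands the residual-charge term $b^1 r$ on the central atom reduces the charge transferred per bond, as one would expect.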
7 Molecular Properties
The concept of electronegativity, with the corresponding values for atoms and their valence orbitals derived from atomic spectroscopic data, would be of little value if it could only be used to obtain partial charges of the atoms in a molecule. The partial charges, even though they give some idea of the electronic charge distribution in a molecule, cannot be measured experimentally. The usefulness of the electronegativity concept rests on the reliability with which observable molecular properties can be correlated or even predicted with the aid of electronegativity values or the derived partial charges. We have shown in the past that it is possible to use the concepts presented above to predict ESCA chemical shifts, proton affinities, NMR chemical shifts, force constants, group electronegativities and hardness parameters for molecular fragments[29,46]. We will not present these correlations again; rather we will focus here on two properties which can be predicted quite reliably using the electronegativity concept presented above, (i) equilibrium bond lengths and (ii) bond energies, as these can provide a basis for the reduction of parameters required in molecular mechanics models for the prediction of molecular structure and molecular properties[30].
7.1 Bond Lengths
Already the simple suggestion by Pauling[47] to estimate the bond length of a covalent bond in a molecule simply as the sum of the two covalent radii of the corresponding atoms
$$d_{AB} = r_A + r_B \tag{7.1}$$
is extremely useful. It allows for an enormous data reduction by the introduction of the atomic property covalent radius (a tertiary atomic property as defined in section 2). There have been several attempts to improve this relation, taking into account a modification due to the electronegativity differences of the two bonded partners[48,49]. Most promising appeared an ansatz due to Sanderson[43], who proposed to modify the covalent radii according to the partial charges on the atoms considered, i.e.
$$r = r^0 + dQ. \tag{7.2}$$
With this we would have two parameters per atom, the covalent radius $r^0$ and a softness parameter d. Building on this, we could demonstrate that it is not necessary to introduce such an extra softness parameter for each atom; rather it suffices to write
$$d = \left(\frac{\partial r}{\partial Q}\right)_{Q=0} = c_0 + \frac{c_1}{\eta}, \tag{7.3}$$
where we just have two global parameters, $c_0$ and $c_1$, the same for all atoms, while the hardness parameter $\eta = c$, see eq.(4.13), is determined in conjunction with the electronegativity for atomic orbitals on the basis of atomic spectroscopic data. Combining eqs.(7.1) through (7.3) yields for the bond length
$$d_{AB} = r_A^0 + r_B^0 + c_0\,(Q_A + Q_B) + c_1\left(\frac{Q_A}{\eta_A} + \frac{Q_B}{\eta_B}\right) + c_2\,\frac{Q_A Q_B}{(\eta_A + \eta_B)\,d_{AB}}, \tag{7.4}$$
where we have introduced in addition the last term with another global parameter $c_2$, in order to account for the Coulomb interaction due to the direct neighbour. With this term a non-linearity has been introduced, but fortunately the term is small, though significant; it is therefore in general sufficient to approximate the $d_{AB}$ in the denominator simply by the sum of the corresponding covalent radii. The parameters in eq.(7.4), 51 covalent radii and the three global parameters $c_0$, $c_1$ and $c_2$, were determined by fitting the expression to bond length data for 543 bonds, derived from spectroscopic structure determinations of the corresponding molecules in the gas phase. Including data from X-ray structures as well would be inappropriate here, as packing effects are not included in the model above, and in solid state structures the partial atomic charges can be significantly different, due to their amplification in a solid caused by the strong multipole forces which stabilise the crystals. The correlation for the regression, displayed in Fig. 1, is excellent. The correlation coefficient obtained was r = 0.997, with a standard error of σ = 0.028 Å and a relative standard error of 1.3%. The covalent radii $r^0$ and the global parameters $c_0$, $c_1$ and $c_2$ are given in Table 2. Note the different covalent radii for different valence states of the same atom! Using just the covalent radii as displayed in Table 2 with the simple eq.(7.1) already yields bond lengths with an error of about 5%; using eq.(7.4), where electronegativity and hardness parameters are required together with the partial charges, reduces this error to less than 1.5%.

Figure 1: Bond Length in [Å], Experiment versus Computed using eq.(7.4). (Scatter plot of computed against experimental bond lengths d(exp.) [Å]; both axes run from 0.5 to 3.5.)
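The structure of eq.(7.4), including the weak non-linearity caused by $d_{AB}$ in the denominator, can be sketched as follows. This is a hedged illustration only: the numerator $Q_A Q_B$ of the $c_2$ term is an assumption made by analogy with the Coulomb term of eq.(7.6), the units of the global parameters are assumed to match charges in units of e and hardness in eV, the function name is hypothetical, and the radii, charges and hardness values in the example are invented.

```python
def bond_length(r0_a, r0_b, q_a, q_b, eta_a, eta_b,
                c0=0.075, c1=-0.844, c2=14.558, sweeps=50):
    """Bond length from eq.(7.4), solved by fixed-point iteration in d_AB.

    c0, c1, c2 default to the global parameters listed in Table 2.
    sweeps=1 reproduces the approximation in which d_AB in the denominator
    is replaced by the sum of the covalent radii.
    """
    linear = (r0_a + r0_b + c0 * (q_a + q_b)
              + c1 * (q_a / eta_a + q_b / eta_b))
    # ASSUMPTION: the Coulomb numerator is the product of the partial charges.
    coulomb = c2 * q_a * q_b / (eta_a + eta_b)
    d = r0_a + r0_b                      # Pauling's estimate, eq.(7.1)
    for _ in range(sweeps):
        d = linear + coulomb / d
    return d


# Invented example values: radii in Angstrom, charges in e, hardness in eV.
d_pauling = bond_length(0.343, 0.726, 0.0, 0.0, 6.4, 7.0)
d_first = bond_length(0.343, 0.726, 0.2, -0.2, 6.4, 7.0, sweeps=1)
d_full = bond_length(0.343, 0.726, 0.2, -0.2, 6.4, 7.0)
print(d_pauling, d_first, d_full)
```

With vanishing partial charges the expression falls back to the sum of the covalent radii; comparing the one-sweep value with the converged one gives a feeling for how little the non-linearity matters when the Coulomb term is small.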
Table 2: Atomic Orbital Covalent Radii $r^0_{Ai}$ in [Å]; see eq.(7.4) for the definition of the parameters $c_0$, $c_1$ and $c_2$.

Global parameters: c0 = 0.075, c1 = -0.844, c2 = 14.558

First block
Atom: H BB C C C N N+ N O F Si P P P S S Cl K Ge Se Br Rb Sn Te I
Hybrid: s te di te di di or te te di or or tr or te te p pe te or te p π p p p or p p p p π p or te
$r^0_{Ai}$ [Å]: 0.343 0.880 0.831 0.763 0.710 0.617 0.695 0.767 0.562 0.552 0.726 1.106 1.097 1.006 0.905 1.016 0.944 0.988 2.198 1.082 1.157 1.135 2.374 1.278 1.275 1.284

Second block
Atom: Li B B C C C N N O F Na Si P P+ S S S Cl Ge As Se Br Sn Sb I
Hybrid: p tr di or tr tr or di or or tr π or te p p p p te te p oh te or te te p p or te te p p
$r^0_{Ai}$ [Å]: 1.502 0.831 0.619 0.747 0.664 0.603 0.686 0.601 0.686 0.569 1.793 1.002 1.018 1.022 1.023 1.029 0.883 0.974 1.192 1.220 1.074 1.084 1.379 1.394 1.338
7.2 Bond Energies
Again we can take as a basis for our considerations the original suggestion of Pauling to express the dissociation energy of a bond between atoms A and B as
$$D_{AB} = \tfrac{1}{2}\,(D_{AA} + D_{BB}) + \Delta_{AB}, \tag{7.5}$$
which he had used to define the electronegativity of the atoms via the extra ionic resonance energy $\Delta_{AB}$. With this he had also introduced the atomic covalent binding energies, $D_{AA}$, as tertiary atomic properties, which could be determined from a large data base of bond energies or atomisation energies, derived from experimentally measured heats of formation. Now that we have independent means to obtain orbital electronegativities, using primary and secondary atomic data, and that we have a relation for the extra ionic resonance energy, eq.(6.3), we can improve the relation above. Using this knowledge and some considerations based on valence bond theory, we developed and tested for the prediction of bond dissociation energies the relation[30]
$$D_{AB} = \frac{1}{1 + |Q_A Q_B|}\left\{\frac{1}{2}\,(D_{AA} + D_{BB}) + c_3\,\frac{(\chi_A - \chi_B)^2}{4(\eta_A + \eta_B)} + c_4\,\frac{Q_A Q_B}{d_{AB}}\right\}, \tag{7.6}$$
where we have introduced the two global parameters $c_3$ and $c_4$ to fit the data. Again it is important to use, especially in the case of carbon, valence orbital specific values for the covalent dissociation energies $D_{AA}$. The prefactor in the relation above reflects the weakening of the bond if the partial charges on the bonded atoms become significant, and the last term reflects the Coulomb interaction between partially charged neighbours, which can be attractive or repulsive. The repulsive case is effective in explaining the weak O-O and N-N bonds in, e.g., H2O2 and N2H4 respectively. The correlation obtained between experimentally determined dissociation energies[50] and the relation, eq.(7.6), is displayed in Fig. 2, and the parameters we obtained in this regression are presented in Table 3. The resulting correlation coefficient was r = 0.987 and the standard error σ = 5 kcal/mol. Even though the correlation obtained is already quite satisfactory, we do hope to improve on it in the future, using a much larger data base to be derived from heats of formation. Finally it should be pointed out that in the estimation of bond lengths and bond dissociation energies using eqs.(7.4) and (7.6), the terms including partial charges Q are generally quite small; thus it will suffice in many cases to estimate these partial charges on chemical grounds, alleviating the need to compute them explicitly.
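A short sketch may make the roles of the three terms in eq.(7.6) and of the prefactor explicit. The function name is hypothetical; the defaults for $c_3$ and $c_4$ are the values read from Table 3, with electronegativity and hardness assumed to be in eV, charges in units of e and distances in Å. The electronegativities, hardness values, charges and bond length in the example are invented for illustration.

```python
def bond_energy(d_aa, d_bb, chi_a, chi_b, eta_a, eta_b, q_a, q_b, d_ab,
                c3=18.36, c4=-751.98):
    """Bond dissociation energy in kcal/mol following the structure of eq.(7.6)."""
    covalent = 0.5 * (d_aa + d_bb)                      # Pauling mean, eq.(7.5)
    ionic = c3 * (chi_a - chi_b) ** 2 / (4.0 * (eta_a + eta_b))
    coulomb = c4 * q_a * q_b / d_ab                     # attractive if charges differ in sign
    return (covalent + ionic + coulomb) / (1.0 + abs(q_a * q_b))


# Homonuclear, uncharged bond: the covalent value D_AA is recovered.
d_cc = bond_energy(85.6, 85.6, 8.0, 8.0, 5.0, 5.0, 0.0, 0.0, 1.53)

# Invented heteronuclear example with opposite partial charges.
d_polar = bond_energy(85.6, 46.9, 8.0, 10.0, 5.0, 6.5, 0.15, -0.15, 1.43)

# Invented example with like charges on both centres (repulsive Coulomb term).
d_like = bond_energy(46.9, 46.9, 10.0, 10.0, 6.5, 6.5, -0.3, -0.3, 1.47)
print(d_cc, d_polar, d_like)
```

Because $c_4$ is negative, opposite partial charges raise the bond energy while like charges on both centres lower it, which is the mechanism the text invokes for the weak O-O and N-N bonds.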
Figure 2: Bond Dissociation Energies in [kcal/mol], Experiment versus Computed. (Scatter plot of computed values against $D_{AB}$(exp.) [kcal/mol].)
Table 3: Covalent Bond Dissociation Energies in [kcal/mole]; see eq.(7.6) for the definition of the parameters $c_3$ and $c_4$.

Atom  Hybrid    DAA       Atom  Hybrid  DAA
H     s         107.5     Li    s       33.0
C     te         85.6     C     tr      (103.)
C     di        139.2     C     tr π    174.9
C     di π π    235.8     N     te      78.2
O     p          46.9     F     p       49.7
Na    s          16.2     Si    te      75.9
P     p          57.3     S     p       63.2
Cl    p          62.2     K     s       10.6
Ge    te         62.4     Br    p       44.6
Rb    s           7.1     Sn    te      45.7
Sb    te         23.2     I     p       37.1

c3 = 18.36 kcal/(mol eV)
c4 = -751.98 kcal Å/(mol e^2)
8 Conclusion

We have come full circle. Starting with eq.(2.1) for the dissociation energy of a chemical bond, which Pauling used more than 60 years ago to introduce the concept of electronegativity into chemistry, we have described a way to calculate such electronegativity values using spectroscopic data of the free atoms. Turning around, we have then used the atomic orbital electronegativities with a slightly modified expression for the bond dissociation energies, eq.(7.6), to calculate the latter. Thus, originally the dissociation energies were the input to compute electronegativity values; now the electronegativity values are the input to calculate bond dissociation energies. Similarly it is possible to determine, using the electronegativity values for the atomic valence orbitals, all those molecular properties on the basis of which a definition of electronegativity had been suggested in the past[13,14]. With this, the full value of the electronegativity concept introduced into chemistry by Pauling could be realized. Standing on the shoulders of scientific giants like Pauling, provided one can climb up, opens great new vistas.

Acknowledgment: The authors are grateful for the financial support of this work by the "Fonds der Chemischen Industrie".
References

[1] J. J. Berzelius, Lehrbuch der Chemie (Arnoldsche Buchhandlung, Dresden and Leipzig, 1835), Vol. 1, p. 163, translated by F. Wöhler.
[2] A. Michael, Chem. Ber. 39, 2138 (1906).
[3] M. S. Kharasch and M. W. Griffin, J. Am. Chem. Soc. 47, 1948 (1925).
[4] M. S. Kharasch and R. Marker, J. Am. Chem. Soc. 48, 1463 (1926).
[5] L. Pauling, The Nature of the Chemical Bond (Cornell UP, Ithaca, New York, 1939).
[6] L. Pauling and D. M. Yost, Proc. Nat. Acad. Sci. US 18, 414 (1932).
[7] L. Pauling, J. Am. Chem. Soc. 54, 3570 (1932).
[8] R. S. Mulliken, J. Chem. Phys. 2, 782 (1934).
[9] R. Daudel and P. Daudel, J. Phys. Radium 7, 12 (1946).
[10] M. Haissinsky, J. Phys. 7, 7 (1946).
[11] M. L. Huggins, J. Am. Chem. Soc. 75, 4123 (1953).
[12] L. Pauling and J. Sherman, J. Am. Chem. Soc. 59, 1450 (1937).
[13] H. O. Pritchard and H. A. Skinner, Chem. Revs. 55, 745 (1955).
[14] J. Hinze, in Mehrelektronen-Modelle, No. 9.3 in Fortschritte der Chemischen Forschung, edited by K. Hafner et al. (Springer, Berlin, Heidelberg, New York, 1968), p. 448.
[15] J. Slater, Phys. Rev. 34, 1293 (1929).
[16] J. H. Van Vleck, J. Chem. Phys. 2, 20 (1934).
[17] E. U. Condon and G. H. Shortley, The Theory of Atomic Spectra (Cambridge UP, Cambridge, UK, 1953).
[18] W. Moffitt, Rep. on Prog. Phys. 17, 173 (1954).
[19] W. Moffitt, Proc. Roy. Soc. (London) A202, 548 (1950).
[20] A. L. Allred and E. G. Rochow, J. Inorg. Nuc. Chem. 5, 264 (1958).
[21] J. C. Slater, Phys. Rev. 36, 57 (1930).
[22] L. Pauling, J. Am. Chem. Soc. 69, 542 (1947).
[23] J. Hinze and H. H. Jaffé, J. Am. Chem. Soc. 84, 540 (1962).
[24] J. Hinze, M. A. Whitehead, and H. H. Jaffé, J. Am. Chem. Soc. 85, 148 (1963).
[25] J. Hinze and H. H. Jaffé, Can. J. Chem. 41, 1315 (1963).
[26] C. E. Moore, Atomic Energy Levels (Natl. Bur. Stand., Washington DC, USA, 1959), Vol. I-III, and later additions.
[27] H. Hotop and W. C. Lineberger, J. Phys. Chem. Ref. Data 4, 539 (1975).
[28] D. Bergmann, Master's thesis, Fakultät für Chemie, Universität Bielefeld, 33615 Bielefeld, 1985.
[29] D. Bergmann and J. Hinze, in Electronegativity, No. 66 in Structure and Bonding, edited by K. D. Sen and C. K. Jørgensen (Springer, Berlin, Heidelberg, New York, 1987), p. 145.
[30] D. Bergmann, Ph.D. thesis, Fakultät für Chemie, Universität Bielefeld, 33615 Bielefeld, 1992.
[31] L. M. Branscomb and S. J. Smith, J. Chem. Phys. 25, 598 (1956).
[32] H. R. Johnson and F. Rohrlich, J. Chem. Phys. 30, 1608 (1959).
[33] R. P. Iczkowski and J. L. Margrave, J. Am. Chem. Soc. 83, 3547 (1961).
[34] R. G. Parr and R. G. Pearson, J. Am. Chem. Soc. 105, 7512 (1983).
[35] Chemical Hardness, No. 80 in Structure and Bonding, edited by K. D. Sen (Springer, Berlin, Heidelberg, New York, 1993).
[36] R. G. Parr, R. A. Donnelly, M. Levy, and W. E. Palke, J. Chem. Phys. 68, 3801 (1978).
[37] Electronegativity, No. 66 in Structure and Bonding, edited by K. D. Sen and C. K. Jørgensen (Springer, Berlin, Heidelberg, New York, 1987).
[38] P. Hohenberg and W. Kohn, Phys. Rev. 136, B864 (1964).
[39] M. B. Einhorn and R. Blankenbecler, Ann. Phys. 67, 480 (1971).
[40] R. G. Parr and W. Yang, J. Am. Chem. Soc. 106, 4049 (1984).
[41] D. Bergmann and J. Hinze, Angew. Chem. Int. Ed. Engl. 35, 150 (1996).
[42] R. T. Sanderson, Science 114, 670 (1951).
[43] R. T. Sanderson, Chemical Bonds and Bond Energy (Academic Press, New York, 1976).
[44] R. T. Sanderson, Polar Covalence (Academic Press, New York, 1983).
[45] J. Cioslowski and S. T. Mixon, J. Am. Chem. Soc. 115, 1084 (1993).
[46] D. Bergmann and J. Hinze, Angew. Chem. 108, 162 (Berichtigung 849) (1996).
[47] L. Pauling and M. L. Huggins, Z. Krist. A87, 205 (1934).
[48] V. Schomaker and D. P. Stevenson, J. Am. Chem. Soc. 63, 37 (1941).
[49] O. E. Polansky and G. Derflinger, Theoret. Chim. Acta 1, 308 (1963).
[50] R. C. Weast (Ed.), Handbook of Chemistry and Physics, 68th ed. (CRC Press, Boca Raton, FL, 1988).
Z.B. Maksić and W.J. Orville-Thomas (Editors), Pauling's Legacy: Modern Modelling of the Chemical Bond
Theoretical and Computational Chemistry, Vol. 6. © 1999 Elsevier Science B.V. All rights reserved.
On Hybrid Orbitals in Momentum Space
B. James Clark, Hartmut L. Schmider, and Vedene H. Smith, Jr.*
Department of Chemistry, Queen's University, Kingston, Ontario, Canada K7L 3N6

*The Natural Sciences and Engineering Research Council of Canada (NSERCC) supplied financial support for this work.

Hybrids constructed from hydrogenic eigenfunctions are examined in their momentum-space representation. It is shown that the absence of certain cross-terms, which cause the breaking of symmetry in position space, leads to inversion symmetry in the complementary momentum representation. Analytical expressions for some simple hybrids in the momentum representation are given, and their nodal and extremal structure is examined. Some rather unusual features are demonstrated by graphical representations. Finally, special attention is paid to the topology at the momentum-space origin and to the explicit form of the moments of the electron density in both spaces.

1. INTRODUCTION

Among the countless concepts that Linus Pauling introduced from quantum mechanics into chemistry[1,2], and that became standard principles of the trade, there is the idea of "hybridization". In the framework of the valence-bond description of a system, it is useful to mix atomic orbitals of the same "n-quantum number", or of similar spatial extent, to construct directed, asymmetric atomic contributions. Although hybrids are not needed in an LCAO-MO description of the system, they have become so much a part of the language of both organic and inorganic chemistry that people will go out of their way to arrive at descriptions that are compatible with them. A much lesser known contribution of Pauling to chemical knowledge is his explicit expression for the momentum representation of the hydrogenic wave function [3]. Momentum space concepts are common among scattering physicists, some experimental chemists and a few theoreticians; however, they have not won over the bulk of chemists nearly as efficiently as the hybrid concept. The reason is that they are somewhat counter-intuitive, and molecular structure is expressed in a rather indirect and (in the truest sense of the word) convoluted manner.

The two concepts have on occasion been brought together. Coulson and Duncanson[4] gave an explicit formula for sp-orbitals based on Slater type orbitals (STO's). Rozendaal and Baerends used hybrids to describe chemical bonding in a momentum representation [5], and more recently, Cooper considered the shape of sp hybrids in momentum space and their impact on momentum densities [6]. We would like to have a closer look at them, in terms of their functional behavior, their nodal structure and their topology. We will do so with a focus on single-center contributions; in other words, we will restrict ourselves to atoms and will only touch upon molecular aspects. Our model system is a hydrogen-like ion, for two reasons: first, it is simple and fulfills the basic premise of energetic degeneracy of the angular contributions; secondly, it was introduced by Linus Pauling, and it is he whom we are honoring, after all.

2. FOURIER TRANSFORMS OF POSITION-SPACE HYBRIDS
To obtain the momentum distribution $\pi_\psi(\vec p)$ due to a single hybrid orbital $\psi(\vec r)$ it is necessary to perform a Dirac-Fourier transform. The square magnitude of the resulting momentum orbital $\tilde\psi(\vec p)$ is the contribution of $\psi$ to the momentum density:
$$\tilde\psi(\vec p) = (2\pi)^{-3/2} \int \psi(\vec r)\, e^{-i\,\vec p \cdot \vec r}\, d\vec r. \tag{1}$$
Note that the hybrid $\psi$ is real. It may be written as a combination of an inversion-symmetric part $\psi_s$ and an antisymmetric part $\psi_a$. The Fourier transform will map the former onto the real part of $\tilde\psi$ and the latter onto its imaginary part, both of which have definite inversion symmetry themselves. As a result, the square magnitude of $\tilde\psi$ is inversion symmetric with respect to $\vec p = 0$ (as momentum densities should be) [7]. Commonly (in position space), hybrid orbitals are written in terms of single-center linear combinations of basis functions that are themselves products of radial parts and real spherical harmonics. Let us consider
$$\psi(\vec r) = \sum_l a_l\, R_l(r) \sum_m b_{lm}\, S_{lm}(\theta, \phi), \tag{2}$$
where we designate $\theta$ as the polar angle, and $\phi$ as the azimuthal one. Here, the following assumptions are made: the radial function $R_l(r)$ is the same for all basis functions of the same "l quantum number", and its dependence on a "shell quantum number" n is of no consequence. The coefficients $a_l$ describe the contribution of s, p, d, ... character to the hybrid, and the $b_{lm}$ govern the shape and orientation of that contribution. The $S_{lm}$ are the real surface harmonics, defined in terms of the spherical harmonics ($Y_{lm}$):
$$S_{lm} = N \times \begin{cases} \mathrm{Re}(Y_{l|m|}) & \text{if } m \ge 0 \\ \mathrm{Im}(Y_{l|m|}) & \text{if } m < 0 \end{cases} \tag{3}$$
Here, N is a normalization factor, chosen such that $\int |S_{lm}|^2\, d\Omega = 1$. The original justification for constructing hybrids was that the radial behavior (in position space) of the constituting functions is rather similar, and that they are energetically near-degenerate. These conditions are fulfilled exactly for the eigenfunctions of a one-particle Coulomb system, the hydrogen-like ions with nuclear charge Z. We will therefore illustrate a few concepts on those. The radial behavior of the hydrogenic eigenfunctions in position and momentum space is exponential and "Lorentzian", respectively, and their nodal structure depends on the associated Laguerre and Gegenbauer polynomials, respectively:
$$R^H_{nl}(r) = \sqrt{\left(\frac{2Z}{n}\right)^{3} \frac{(n-l-1)!}{2n\,(n+l)!}}\; \left(\frac{2Zr}{n}\right)^{l} e^{-Zr/n}\, L^{2l+1}_{n-l-1}\!\left(\frac{2Zr}{n}\right), \tag{4}$$
$$\tilde R^H_{nl}(p) = (-i)^l \sqrt{\frac{2Z\,(n-l-1)!}{\pi\,(n+l)!}}\; l!\; \frac{(Z/n)^{l+2}\,(4p)^{l+1}}{p\,\left[(Z/n)^2 + p^2\right]^{l+2}}\; C^{l+1}_{n-l-1}\!\left(\frac{p^2 - (Z/n)^2}{p^2 + (Z/n)^2}\right). \tag{5}$$
The momentum-space expression was first given by Podolsky and Pauling [3] in 1929. Note that for any real position function $R^H$, the corresponding momentum radial function $\tilde R^H$ will be either purely real or purely imaginary, depending on whether the angular part of the orbital is even or odd (see also [8]). The factor $(-i)^l$ in Eqn.(5) has, e.g., the consequence that s-type and p-type functions do not "mix" in momentum space, which leads to hybrids that have a different nodal structure.

3. HYBRIDS IN MOMENTUM SPACE

3.1. Hybrids of the $sp^a$-type
The most commonly encountered hybrids in organic chemistry are linear combinations of s- and p-type orbitals. Depending on the linear coefficients of the real p-functions in the three Cartesian directions, the resulting set of hybrids will be oriented at different solid angles from each other. For example, linear combinations of the form $\tfrac{1}{2}(s \pm p_x \pm p_y \pm p_z)$ will yield hybrids with a tetrahedral angle among them. This basic geometry will be retained in momentum space, since the Fourier transform (1) is a direction-preserving, unitary transformation. However, the fact that even and odd contributions to the position-space hybrid are transformed separately (via cosine and sine transforms) into the real and imaginary parts of the momentum hybrid means that the resulting densities (i.e., the square magnitudes) are inversion symmetric with respect to the origin, which is obviously not the case in position space. Therefore, an arrangement of orbitals in a point group G in position space will lead to an arrangement in the point group $G' = G \times C_i$ in momentum space; this is the direct product of G with the inversion-symmetric group (see [7] and [9]). For the following considerations, the orientations of the hybrids are not relevant, and we therefore mix only $p_z$ orbitals with s-functions, resulting in an orbital that points along the polar axis z in both spaces. This takes the following form for the n = 2 shell:
$$\psi_{2sp^a}(\vec r) = \frac{1}{\sqrt{1+a}}\left(\psi_{2s}(\vec r) + \sqrt{a}\,\psi_{2p_z}(\vec r)\right) = \frac{Z^{3/2}}{\sqrt{32\pi(1+a)}}\; e^{-Zr/2}\left[2 - Zr + \sqrt{a}\,Zr\cos\theta\right], \tag{6}$$
$$\tilde\psi_{2sp^a}(\vec p) = \frac{1}{\sqrt{1+a}}\left(\tilde\psi_{2s}(\vec p) + \sqrt{a}\,\tilde\psi_{2p_z}(\vec p)\right) = \frac{16\,Z^{5/2}\left[4p^2 - Z^2 - 4i\sqrt{a}\,Zp\cos\theta\right]}{\pi\sqrt{1+a}\;(Z^2 + 4p^2)^3}. \tag{7}$$
In position space, $\psi_{2sp^a}$ is real and its nodal surface is defined by
$$r = -\frac{2}{Z}\,\frac{1 + \sqrt{a}\cos\theta}{a\cos^2\theta - 1}. \tag{8}$$
In contrast, the real part of the momentum hybrid $\tilde\psi_{2sp^a}$ has a spherical node (at p = Z/2), whereas the imaginary part has a planar one (at $\theta = \pi/2$). This means that the subspace where $\tilde\psi_{2sp^a} = 0$ is one-dimensional. It is, in fact, a circle of radius p = Z/2 in the xy-plane², with center at p = 0. Coulson and Duncanson [4] obtained an analytical expression for a momentum hybrid of a C-atom, constructed from STO's, that shows similar structure. Later, Cooper and Loades gave contour plots of several similar hybrids in momentum space [6]. However, neither commented on this interesting feature. It is widely known that total momentum densities for atoms are not always monotonically decreasing [10]. In fact, the degree of non-monotonicity depends on the degree of p-population in an atom. This fact is visible as well in the shape of $sp^a$ hybrids in momentum space.
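The nodal circle and the inversion symmetry of the momentum density can be checked numerically from the closed form of eq.(7). The sketch below is only an illustration; the function names are hypothetical, and Cartesian momentum components are used so that $p\cos\theta = p_z$.

```python
import math


def momentum_hybrid(px, py, pz, a, z=1.0):
    """Closed form of eq.(7) for the 2sp^a hybrid pointing along z."""
    p2 = px * px + py * py + pz * pz
    prefactor = 16.0 * z ** 2.5 / (math.pi * math.sqrt(1.0 + a)
                                   * (z * z + 4.0 * p2) ** 3)
    # Real part: s contribution; imaginary part: p_z contribution.
    return prefactor * complex(4.0 * p2 - z * z, -4.0 * math.sqrt(a) * z * pz)


def momentum_density(px, py, pz, a, z=1.0):
    """Square magnitude of the momentum hybrid."""
    return abs(momentum_hybrid(px, py, pz, a, z)) ** 2


a = 3.0

# Inversion symmetry: the density is the same at p and at -p.
rho_plus = momentum_density(0.3, -0.2, 0.7, a)
rho_minus = momentum_density(-0.3, 0.2, -0.7, a)

# Nodal circle of radius Z/2 in the xy-plane (Z = 1 here).
circle = [momentum_density(0.5 * math.cos(phi), 0.5 * math.sin(phi), 0.0, a)
          for phi in (0.0, 0.4, 1.3, 2.9, 4.1)]

# On the sphere p = Z/2 but off the plane, only the real part vanishes.
on_sphere = momentum_hybrid(0.0, 0.0, 0.5, a)
print(rho_plus, rho_minus, max(circle), on_sphere)
```

The density vanishes only where the spherical node of the real part meets the planar node of the imaginary part, which is why the zero set is a circle rather than a surface.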
Figure 1. Surface plot of the orbital densities for $sp^a$ hybrids in momentum space. The hybrids are based on the hydrogenic wave functions. The three plots pertain to a = 1, 2 and 3, respectively. The hybrids point in the z-direction. A section through the density in the xz-plane is displayed.
² Throughout the article, we use the notation x, y, z for the Cartesian components of the momentum vector $\vec p$, and $\theta$, $\phi$ for its angular spherical coordinates. This is done to avoid excessive subscripts, and confusion with p-orbitals.

In Fig.(1), we display the momentum density contributions of commonly encountered hybrid orbitals, obtained from hydrogenic eigenfunctions with Z = 1. The figure shows surface plots of the densities for $sp^a$ in the xz-plane for a = 1, 2 and 3. It may be seen that, while the sp hybrid exhibits a maximum at p = 0, greater p-contributions flatten this maximum out, leading to a plateau for $sp^2$, and finally a saddle point for $sp^3$. All of these densities feature two points in the xz-plane where the density vanishes exactly. They are situated on the x-axis, as sections along that axis demonstrate clearly. We show those in Fig.(2). Independently of the mixing coefficient a, those "nodal points" occur at $x = \pm 1/2$ on each equatorial axis. They are the intersection of the aforementioned nodal circle with the displayed plane. To further assess the extremal structure of the densities corresponding to our hybrids, we also display the curvature of the density along the polar axis in Fig.(3). The analytical expression for these curves at the origin is
$$\left.\frac{\partial^2 \pi_\psi(\vec p)}{\partial p_z^2}\right|_{p=0} = \frac{8192\,(a-2)}{Z^5\pi^2\,(a+1)}. \tag{9}$$
Therefore, clearly the $sp^2$ hybrid is the limiting case for which the transition from maximum to saddle point occurs, independently of the nuclear charge Z. We note that in the paper of Cooper and Loades [6], this distinction was not found for the momentum hybrids: using Gaussian orbitals, with exponents tailored for the carbon atom, the origin is a saddle point for all three hybrids. For the hybrids that we have constructed, maxima will occur for any a > 2 at
$$p_{max} = \frac{Z}{4}\left(6 - 10a + 2\sqrt{1 - 26a + 25a^2}\right)^{1/2}. \tag{10}$$
As we should expect, increasing the amount of p character, by increasing the value of a, will shift the maxima of the momentum distribution to higher values of p. This fact well illustrates Epstein and Tanner's Hybrid Orbital Principle, which states that "increased p character in an s-p type hybrid orbital results in increased density at high momentum" [11]. In addition, one should contrast the appearance of the plots from Fig.(1) with our more familiar position space representation of these $sp^a$ hybrids, as shown in Fig.(4). There, we always see two distinct maxima [6], the sharper of which is located right at the origin.
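Eqs.(9) and (10) both follow from the density of eq.(7) restricted to the polar axis, and may be checked numerically. The sketch below is an illustration with hypothetical function names; the curvature is estimated by a central finite difference, so agreement with eq.(9) is only expected to within the truncation error of that difference.

```python
import math


def density_on_axis(pz, a, z=1.0):
    """Square magnitude of eq.(7) on the polar axis (cos(theta) = 1, p = |pz|)."""
    p2 = pz * pz
    numerator = (4.0 * p2 - z * z) ** 2 + 16.0 * a * z * z * p2
    return 256.0 * z ** 5 * numerator / (math.pi ** 2 * (1.0 + a)
                                         * (z * z + 4.0 * p2) ** 6)


def curvature_at_origin(a, z=1.0):
    """Analytical second derivative at p = 0, eq.(9)."""
    return 8192.0 * (a - 2.0) / (z ** 5 * math.pi ** 2 * (a + 1.0))


def numeric_curvature(a, z=1.0, h=1e-3):
    """Central finite-difference estimate of the same second derivative."""
    f = density_on_axis
    return (f(h, a, z) - 2.0 * f(0.0, a, z) + f(-h, a, z)) / (h * h)


def p_max(a, z=1.0):
    """Position of the off-origin maximum on the polar axis, eq.(10); 0 for a <= 2."""
    if a <= 2.0:
        return 0.0
    inner = 6.0 - 10.0 * a + 2.0 * math.sqrt(1.0 - 26.0 * a + 25.0 * a * a)
    return 0.25 * z * math.sqrt(inner)


for a in (1.0, 2.0, 3.0):
    print(a, curvature_at_origin(a), numeric_curvature(a), p_max(a))
```

The sign change of the curvature at a = 2 marks the transition from a maximum to a saddle point at the origin, and for a > 2 the density on the axis is larger at the value returned by `p_max` than at neighbouring momenta.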
3.2. Hybrids involving d-orbitals For most of organic chemistry the description in terms of sp a hybrids is sufficient for a qualitative picture. However, if the coordination numbers involved are greater than 4, as is the case for the majority of compounds involving transition metals, d-hybridization has to be taken into account. Since the m-quantum number of a d-function influences not
Figure 2. Sections through the orbital momentum densities displayed in Fig.(1) along the x-axis with z = 0, i.e. perpendicular to the main axis of the hybrid. The three plots pertain to a = 1, a = 2 and a = 3, respectively. Note that each density vanishes at two points on the x-axis.
Figure 3. Second derivative of the orbital momentum density, $\partial^2 \pi_\psi(\vec p)/\partial p_z^2$, in the z-direction, along the polar axis. Note that for a = 1 (left plot), this quantity is negative around p = 0, for a = 2 (middle plot) it is exactly zero, indicating a plateau, and for a = 3 (right plot) it is positive, denoting a saddle point.
Figure 4. The sp, sp 2, and sp 3 hybrid density functions in the xz-plane of position space. As is often the case, orbitals that are quite different from one another in momentum space, can appear very similar in the corresponding position space representation.
only the orientation, but also the shape of the constituting atomic orbitals, we have to distinguish several cases. In the following, we focus on two of them. The first are hybrids that are directed along the z-axis, such as the ones in an octahedral complex, which have the general form $sp_z^a d_{z^2}^b$. The second lies in the xy-plane, and is exemplary of a hybrid that would be used to describe bonding in square-planar compounds. These hybrids are of the form $sp_x^a d_{x^2-y^2}^b$. Other combinations are possible, but will in general show similar structural features.
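The first class takes the following form (for the n = 3 shell) in momentum space:

$$
\begin{aligned}
\psi_{sp_z^a d_{z^2}^b}(\vec p) &= \frac{1}{\sqrt{1+a+b}}\left(\psi_{3s}(\vec p) + \sqrt{a}\,\psi_{3p_z}(\vec p) + \sqrt{b}\,\psi_{3d_{z^2}}(\vec p)\right) \\
&= \frac{18\sqrt{3}\,Z^{5/2}\left(81\sqrt{2}\,p^4 - 6Z^2p^2\left[5\sqrt{2} + 4\sqrt{b}\,(3\cos^2\theta - 1)\right] + \sqrt{2}\,Z^4\right)}{\pi\sqrt{1+a+b}\,(Z^2+9p^2)^4} \\
&\quad + i\,\frac{432\,p\cos\theta\,\sqrt{Z^7 a}\,(Z^2 - 9p^2)}{\pi\sqrt{1+a+b}\,(Z^2+9p^2)^4}
\end{aligned}
\tag{11}
$$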
In the above equation, the hybrid is clearly broken down into a real part (second line), and an imaginary part (third line). We have found it convenient to analyze these two parts of the hybrid separately because of the earlier mentioned property that the real part of a hybrid will not "mix" with the imaginary part when computing expectation values and densities. The real part of Eqn.(11) arises from the mixing of s- and d-contributions. It has roots on either side of the xy-plane, on two closed $C_\infty$-rotation-symmetric surfaces, that are
Figure 5. Nodal surfaces of a $sp_z^3 d_{z^2}^2$ hybrid orbital with Z = 1 in the momentum-space representation. The left-hand plot contains two surfaces. One is the spherical node of the imaginary part. The second, more complex surface consists of two closed and flattened spheres. These are the nodal surfaces belonging to the real part of the hybrid and are aligned along the z-axis. The intersections of the two types of nodes are two circles around the z-axis. The right-hand plot displays a cut through the xz-plane. Note that the (polar) z-axis is the horizontal axis in this plot. To avoid confusion, the nodal planes of the imaginary part are not displayed in either graph.
the solutions of a quadratic equation in $p^2$. Figure (5) shows these surfaces (left plot), as well as a section through the xz-plane (right plot). The central sphere in the left plot and the circle in the right plot are nodes of the imaginary part of the hybrid. This imaginary part (third line of Eqn.(11)) has nodal surfaces which consist of the xy-plane (θ = π/2, not shown in the plots), and a sphere centered at p = 0 with radius Z/3. As a result, the density (i.e. the square magnitude of $\psi_{sp_z^a d_{z^2}^b}$) vanishes on a pair of circles at $p_z = \pm\frac{Z}{3}\left[\frac{1}{3}\left(1 - \frac{1}{\sqrt{2b}}\right)\right]^{1/2}$ with radius $\frac{Z}{3}\left[\frac{2}{3} + \frac{\sqrt{2}}{6\sqrt{b}}\right]^{1/2}$. These circles are the intersection of the roots of the real and the imaginary parts. This nodal behavior is in clear contrast to the one in position space, where nodal surfaces of a rather complex shape are observed.
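The location of these nodal circles can be verified directly from Eqn.(11). The sketch below is an illustration under stated assumptions rather than part of the original analysis: it evaluates the real and imaginary parts of the hybrid on the sphere p = Z/3 and uses the relation $\cos^2\theta_0 = (1 - 1/\sqrt{2b})/3$, which is obtained here by setting the real part to zero at that radius. The function names are hypothetical.

```python
import math


def hybrid_parts(p: float, theta: float, a: float, b: float, Z: float = 1.0):
    """Real and imaginary parts of the sp_z^a d_z2^b momentum hybrid, Eqn.(11)."""
    norm = math.pi * math.sqrt(1.0 + a + b) * (Z * Z + 9.0 * p * p) ** 4
    angular = 3.0 * math.cos(theta) ** 2 - 1.0
    real = 18.0 * math.sqrt(3.0) * Z**2.5 * (
        81.0 * math.sqrt(2.0) * p**4
        - 6.0 * Z * Z * p * p * (5.0 * math.sqrt(2.0) + 4.0 * math.sqrt(b) * angular)
        + math.sqrt(2.0) * Z**4
    ) / norm
    imag = 432.0 * p * math.cos(theta) * math.sqrt(Z**7 * a) * (Z * Z - 9.0 * p * p) / norm
    return real, imag


def nodal_angle(b: float) -> float:
    """Polar angle theta_0 of the nodal circles; requires b >= 1/2."""
    if b < 0.5:
        raise ValueError("real and imaginary nodes do not intersect for b < 1/2")
    return math.acos(math.sqrt((1.0 - 1.0 / math.sqrt(2.0 * b)) / 3.0))


if __name__ == "__main__":
    Z = 1.0
    for b in (0.5, 2.0, 5.0):
        theta0 = nodal_angle(b)
        re, im = hybrid_parts(Z / 3.0, theta0, a=3.0, b=b, Z=Z)
        print(f"b = {b}: theta_0 = {theta0 / math.pi:.4f} pi, real = {re:.2e}, imag = {im:.2e}")
```

Both parts vanish (to rounding error) at the computed angle, so the density itself is zero there, whereas at other angles on the same sphere only the imaginary part vanishes.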
Figure 6. Surface plots of the momentum density corresponding to two different hybrids of the form $sp_z^3 d_{z^2}^b$. For the left plot, b = 2 (as in octahedral hybrids), for the right one, b = 5 (maximum bond strength). The surface plots are sections through the xz-plane. Note that they both exhibit four points where the density vanishes exactly.
Fig.(6) shows the momentum densities corresponding to two different hybrids of this type. The first one is a $sp^3d^2$ orbital as is encountered in octahedral complexes, the second one is a "maximum-bond hybrid" $sp^3d^5$. The basic features are rather similar, although somewhat more strongly pronounced in the one with the greater d-component. The plots show a section through the densities in the xz-plane. Particularly for the $sp^3d^5$ hybrid, a set of 4 "holes" parallel to the x-axis can be observed, arising from the aforementioned circular nodes. To show these "holes" more clearly, we have plotted in Fig.(7) circular sections through the densities, lying in the xz-plane with radius 1/3. The graphs show the momentum density as a function of the polar angle θ in units of π. Note that the density is maximal in the z-direction (θ = tπ; t = 0, 1, 2), and in the x-direction (θ = tπ; t = 1/2, 3/2),
Figure 7. Circular sections through the momentum densities displayed in Fig.(6). The plots display the value of the density along a circle of radius 1/3 in the xz-plane, as a function of the polar angle θ = tπ. The nodal points are clearly visible.
as would be expected and is also observed in position space. However, at angles of $\theta_0$ and $\pi - \theta_0$ with $\theta_0 = \arccos\sqrt{\frac{1}{3}\left(1 - \frac{1}{\sqrt{2b}}\right)}$, the density goes exactly to zero, since this is the angle under which the real and imaginary nodes intersect. This angle is not dependent on Z, but depends weakly on b. It varies between π/2 for b = 1/2 and $\pi/2 - \arcsin(1/\sqrt{3})$ for b → ∞.

The second class of spd-type hybrids that we will treat here are situated in the equatorial plane, i.e. at θ = π/2. They consist of a linear combination of s, $p_x$ and $d_{x^2-y^2}$ orbitals, and have proven useful in describing the bonding in square-planar complexes. Their form (in momentum space) is:
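$$
\begin{aligned}
\psi_{sp_x^a d_{x^2-y^2}^b}(\vec p) &= \frac{1}{\sqrt{1+a+b}}\left(\psi_{3s}(\vec p) + \sqrt{a}\,\psi_{3p_x}(\vec p) + \sqrt{b}\,\psi_{3d_{x^2-y^2}}(\vec p)\right) \\
&= \frac{18\,Z^{5/2}\left(\sqrt{6}\,(81p^4 + Z^4) - 6Z^2p^2\left(12\sqrt{b}\,\cos(2\phi)\sin^2\theta + 5\sqrt{6}\right)\right)}{\pi\sqrt{1+a+b}\,(Z^2+9p^2)^4} \\
&\quad + i\,\frac{432\,p\,\sqrt{Z^7 a}\,\sin\theta\cos\phi\,(Z^2 - 9p^2)}{\pi\sqrt{1+a+b}\,(Z^2+9p^2)^4}
\end{aligned}
\tag{12}
$$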
Note that the imaginary part is essentially the same as in Eqn.(11), but the real part differs in shape. The consequence is a qualitatively different shape of both the nodal surfaces for the real part, and the nodal curves for the density. Figs.(8-10) show various aspects of the resulting density contribution for a $sp_x^2 d_{x^2-y^2}$
Figure 8. Plots of the nodal surfaces in a $sp_x^2 d_{x^2-y^2}$ hybrid orbital (Z = 1) in the momentum representation. The left plot shows the surface due to the real part (i.e., s and d contributions) only, whereas the right one combines it with the planar and spherical nodal surfaces characteristic of the imaginary (i.e., p-) component.
orbital, as this is the one used to describe square-planar complexes. Since the real part of Eqn.(12) is a linear combination of spherically symmetric s-contributions (with 2 spherical nodes) and "rosette-shaped" d-functions, the resulting nodal surface is somewhat "donut-shaped" around the y-direction. It is displayed in the left plot of Fig.(8). On the right-hand side we combined this surface with the planar and the spherical node of the imaginary part, which are due to the hybrid's p-contributions. Cuts through the xy-, xz- and yz-planes may serve to clarify the topology further (see Fig.(9)). The features in the two planes that are cut by the donut (xy and yz) are rather similar to the ones encountered earlier for the axial hybrids $sp_z^a d_{z^2}^b$. However, in the plane of the ring (xz) the picture differs considerably, and shows no intersections. Instead, the curves are nested within one another. Note also that the yz-plane is in itself a nodal plane of the imaginary part, and that therefore the cuts arising from the real part are nodes of the density. The intersections of the circle and the "outer" curves in the right plot are in fact intersections of 3 nodal surfaces, 2 from the imaginary and one from the real part. The lower right plot of Fig.(9) shows the nodes of the density (i.e., the intersections in 3D-space). The two closed curves in the yz-plane arise from intersections of the donut-shaped node of the real part with the planar node of the imaginary one, whereas the other curves are intersections of the spherical node of the imaginary part with the donut, close to its "hole". The explicit expressions for the curves displayed in the lower right plot can be derived. They are:

$$ p = \frac{Z}{3\sqrt{3}}\left(\sqrt{4 - \sqrt{6}\sin^2\theta} \pm \sqrt{1 - \sqrt{6}\sin^2\theta}\right); \quad \phi = \pm\frac{\pi}{2} \tag{13} $$
Figure 9. Traces of the nodal surfaces in the xy-, xz- and yz-planes for the $sp_x^2 d_{x^2-y^2}$ hybrid. It can be seen that while the basic features in two of the planes (xy and yz) are rather similar to the ones observed in $sp_z^a d_{z^2}^b$ type hybrids, the situation in the xz-plane is completely different. The lower right plot shows the nodal lines that are the intersections of various surfaces displayed in Fig.(8).
$$ \phi = \pm\frac{\pi}{2} \pm \frac{1}{2}\arccos\left(\frac{1}{\sqrt{6}\sin^2\theta}\right); \quad p = \frac{Z}{3} \tag{14} $$
It is rather surprising that such simple linear combinations will produce such complicated topologies in momentum space. Fig.(10) shows a surface plot of a section of the momentum density in the xy-plane, where the density is accumulated (left). The seemingly monotonous distribution exhibits on closer inspection a good deal of fine structure: first, there are the aforementioned "holes" in the vicinity of the nodal lines; secondly, the apparent maximum reveals itself on an enhanced scale (right plot), to be a saddle point that is minimal in the x-direction.
Figure 10. Surface plot (left) of the momentum density corresponding to a $sp_x^2 d_{x^2-y^2}$ hybrid orbital. The section displayed lies in the xy-plane. Although the overall features seem to be rather weak, a complicated nodal structure is observed, and the origin is a saddle point (right). The right plot shows the momentum density on the x-axis.
For all momentum densities, the origin of momentum space is necessarily a critical point. This arises from the inversion center and the requirement of continuity. However, the topology at that point may vary considerably for the hybrids considered here. We pointed out already in the previous section that the $sp^a$ hybrids do not always exhibit a pair of off-center maxima. For hybrids containing d-functions, the picture is further complicated. To obtain a clear idea, it is best to compute the diagonal elements of the Hessian matrix of the density, $H_{ij} = \partial^2 \pi_\psi(\vec p)/\partial p_i \partial p_j$, in Cartesian coordinates. From their signs it may be inferred what type of critical point is observed. For the $sp_z^a d_{z^2}^b$-type hybrids, this yields
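$$ \left(\frac{\partial^2 \pi_\psi}{\partial p_x^2},\; \frac{\partial^2 \pi_\psi}{\partial p_y^2},\; \frac{\partial^2 \pi_\psi}{\partial p_z^2}\right) = \tau\left(2\sqrt{2b} - 11,\; 2\sqrt{2b} - 11,\; 8a - 4\sqrt{2b} - 11\right) \tag{15} $$

$$ \nabla^2 \pi_\psi = \tau\,(8a - 33) \tag{16} $$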
where $\tau = 46656/\left(Z^5\pi^2(1 + a + b)\right)$ is a common pre-factor. The eigenvalue structure in Eqn.(15) indicates that p = 0 is a minimum in the z-direction whenever

$$ b < 2a^2 - \frac{11a}{2} + \frac{121}{32} \quad \text{and} \quad a > \frac{11}{8}. \tag{17} $$
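The diagonal Hessian elements of Eqn.(15) can be cross-checked against a finite-difference evaluation of the density built from Eqn.(11). This is a hedged sketch, not the authors' derivation: the Cartesian rewriting $p^2(3\cos^2\theta - 1) = 3p_z^2 - p^2$, the function names and the step size are choices made here for illustration.

```python
import math


def density_spd_z(px: float, py: float, pz: float, a: float, b: float, Z: float = 1.0) -> float:
    """|psi|^2 of the sp_z^a d_z2^b hybrid, Eqn.(11), in Cartesian momentum components."""
    p2 = px * px + py * py + pz * pz
    norm = math.pi * math.sqrt(1.0 + a + b) * (Z * Z + 9.0 * p2) ** 4
    real = 18.0 * math.sqrt(3.0) * Z**2.5 * (
        math.sqrt(2.0) * (81.0 * p2 * p2 + Z**4)
        - 6.0 * Z * Z * (5.0 * math.sqrt(2.0) * p2 + 4.0 * math.sqrt(b) * (3.0 * pz * pz - p2))
    ) / norm
    imag = 432.0 * math.sqrt(a) * Z**3.5 * pz * (Z * Z - 9.0 * p2) / norm
    return real * real + imag * imag


def hessian_diagonal(a: float, b: float, Z: float = 1.0, h: float = 5e-4):
    """Finite-difference second derivatives along x, y, z at the origin."""
    step = h * Z  # scale the step with the momentum extent of the orbital
    f0 = density_spd_z(0.0, 0.0, 0.0, a, b, Z)
    result = []
    for axis in range(3):
        plus = [0.0, 0.0, 0.0]
        plus[axis] = step
        minus = [-c for c in plus]
        fp = density_spd_z(plus[0], plus[1], plus[2], a, b, Z)
        fm = density_spd_z(minus[0], minus[1], minus[2], a, b, Z)
        result.append((fp - 2.0 * f0 + fm) / (step * step))
    return tuple(result)


def hessian_eq15(a: float, b: float, Z: float = 1.0):
    """Closed-form diagonal elements, Eqn.(15)."""
    tau = 46656.0 / (Z**5 * math.pi**2 * (1.0 + a + b))
    s = math.sqrt(2.0 * b)
    return (tau * (2.0 * s - 11.0), tau * (2.0 * s - 11.0), tau * (8.0 * a - 4.0 * s - 11.0))


if __name__ == "__main__":
    a, b = 3.0, 2.0  # the octahedral sp^3 d^2 case
    print("numerical :", hessian_diagonal(a, b))
    print("Eqn.(15)  :", hessian_eq15(a, b))
    print("minimum along z per Eqn.(17):", b < 2 * a * a - 5.5 * a + 121 / 32 and a > 11 / 8)
```

For a = 3 and b = 2 the elements in units of τ are (-7, -7, 5), i.e. a saddle point at the origin, and their sum reproduces the Laplacian τ(8a - 33) of Eqn.(16).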
These conditions are independent of Z, since the latter determines only the spatial extent, but not the basic topology. From Eqn.(15) we also infer that d-contributions tend to flatten out the minimum at the origin (see also Fig.(6)) by moving the maxima towards higher momenta, whereas
Figure 11. Map for the topology of the momentum density at the origin of momentum space, p = 0, for hybrids containing d-functions. The left plot is for functions of the $sp_z^a d_{z^2}^b$-type, the right one for the $sp_x^a d_{x^2-y^2}^b$-type. The axes are the mixing coefficients a and b, respectively, and the lines drawn separate regions of different topology.
[Figure 11 plots: horizontal axis a, vertical axis b; the regions are labelled maximum, saddle, minimum and ring.]
p-contributions sharpen the minimum, although not deepening it. In the x-direction (or more generally, perpendicular to the polar axis), p = 0 is a maximum for b < 121/8, i.e. in all cases of practical interest. The left plot in Fig.(11) shows the regions of different possible topologies at p = 0 in the (a, b) plane. Note that for "reasonable" hybrids, a < 3, and the origin in momentum space is therefore either a saddle point or a maximum. Rings or minima would only occur with very low relative s-contributions. Note that the Laplacian (16) (as the sum of second derivatives) at the origin does not depend on the d-contributions. They merely serve to "rearrange" the density, whereas p-components will make the Laplacian less and less negative. For the other type of d-containing hybrids, the $sp_x^a d_{x^2-y^2}^b$ hybrids, we have
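$$ \left(\frac{\partial^2 \pi_\psi}{\partial p_x^2},\; \frac{\partial^2 \pi_\psi}{\partial p_y^2},\; \frac{\partial^2 \pi_\psi}{\partial p_z^2}\right) = \tau\left(8a - 2\sqrt{6b} - 11,\; 2\sqrt{6b} - 11,\; -11\right) \tag{18} $$

$$ \nabla^2 \pi_\psi = \tau\,(8a - 33) \tag{19} $$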
where τ is the same as in Eqn.(15). This means that the density is always a maximum in the z-direction at p = 0 for this type of hybrid. In the x-direction (which is the main axis of the hybrid), it is minimal if

$$ b < \frac{8a^2}{3} - \frac{22a}{3} + \frac{121}{24} \quad \text{and} \quad a > \frac{11}{8}, \tag{20} $$
and in the y-direction it is maximal as long as b < 121/24 ≈ 5.0417, i.e. up to and including $sp_x^a d_{x^2-y^2}^5$ hybrids. The Laplacian is the same as for the previous type, which is
a trivial consequence of the fact that it depends (for p = 0) only on the radial part of the constituting orbitals. Note that our example ($sp_x^2 d_{x^2-y^2}$) has a saddle point of the momentum density at the origin. This fact cannot be seen at the resolution of Fig.(10), since the saddle is extremely shallow (the second derivative is "only" $11664(5 - 2\sqrt{6})/\pi^2 \approx 119.4$). The right-hand plot in Fig.(11) shows the relevant section in the (a, b) plane with the proper designations for the type of critical point occurring at the origin. Note that a cage-structure is impossible since the hybrid is always concentrated in the xy-plane.

Table 1
Moments of the $sp^a$ hybrids in both coordinate and momentum spaces.

n    |  $\langle p^n \rangle$                          |  $\langle r^n \rangle$
-2   |  $\frac{4}{3}\,\frac{39+7a}{Z^2(1+a)}$          |  $\frac{1}{12}\,\frac{(3+a)Z^2}{1+a}$
-1   |  $\frac{128}{15}\,\frac{a+2}{Z\pi(1+a)}$        |  $\frac{1}{4}Z$
0    |  $1$                                            |  $1$
1    |  $\frac{16}{45}\,\frac{Z(3+4a)}{\pi(1+a)}$      |  $\frac{6+5a}{Z(1+a)}$
2    |  $\frac{1}{4}Z^2$                               |  $6\,\frac{5a+7}{Z^2(1+a)}$
3    |  $\frac{8}{15}\,\frac{Z^3(a+2)}{\pi(1+a)}$      |  $30\,\frac{11+7a}{Z^3(1+a)}$
4    |  $\frac{1}{48}\,\frac{Z^4(39+7a)}{1+a}$         |  $240\,\frac{12+7a}{Z^4(1+a)}$
4. Moments of the Hybrid Orbitals
It is a simple matter to derive expressions for the moments of the hybrid orbital densities. In momentum space, the expressions take the form

$$ \int p^n\, \pi_\psi(\vec p)\, d\vec p = f(\{a\})\, Z^n \tag{21} $$

where f({a}) is a function of the mixing coefficients (one mixing coefficient for the $sp^a$ hybrids, two mixing coefficients for the $sp^a d^b$ hybrids, etc.). The analogous expression in position space has an inverse dependence on the atomic charge, Z:

$$ \int r^n\, \rho_\psi(\vec r)\, d\vec r = g(\{a\})\, Z^{-n} \tag{22} $$

where g({a}) is also a function of the mixing coefficients, although different from the f function in Eqn.(21).
Complete expressions for the hybrid charge density moments, in both position and momentum space, are given in Tables 1 and 2. The former table contains results for the $sp^a$ hybrids while the latter gives results for the $sp^a d^b$ hybrids.

Table 2
Moments of the $sp^a d^b$ hybrids in both coordinate and momentum spaces.

n    |  $\langle p^n \rangle$                                         |  $\langle r^n \rangle$
-2   |  $\frac{9}{5}\,\frac{105+9b+25a}{Z^2(1+a+b)}$                  |  $\frac{2}{405}\,\frac{(15+3b+5a)Z^2}{1+a+b}$
-1   |  $\frac{16}{175}\,\frac{200a+128b+335}{Z\pi(1+a+b)}$           |  $\frac{1}{9}Z$
0    |  $1$                                                           |  $1$
1    |  $\frac{8}{1575}\,\frac{Z(135+160a+192b)}{\pi(1+a+b)}$         |  $\frac{1}{2}\,\frac{27+25a+21b}{Z(1+a+b)}$
2    |  $\frac{1}{9}Z^2$                                              |  $9\,\frac{20a+14b+23}{Z^2(1+a+b)}$
3    |  $\frac{16}{14175}\,\frac{Z^3(200a+128b+335)}{\pi(1+a+b)}$     |  $\frac{81}{2}\,\frac{70a+42b+85}{Z^3(1+a+b)}$
4    |  $\frac{1}{405}\,\frac{Z^4(105+9b+25a)}{1+a+b}$                |  $\frac{405}{2}\,\frac{238a+126b+303}{Z^4(1+a+b)}$
Notice that the $\langle r^{-1} \rangle$ and the $\langle p^2 \rangle$ moments have no dependence on the mixing parameters. They are simple functions of Z. This is a feature unique to the case where the hybrid orbitals are constructed from degenerate atomic orbitals. Since hydrogenic functions with the same n quantum number fulfill this condition, the "energy" moments, $\langle r^{-1} \rangle$ and $\langle p^2 \rangle$, will not depend on how the hybrid orbitals are mixed. The moments in Table 2 also do not depend on what types of d functions are used to construct the hybrids. In other words, the moments do not depend on the m quantum number. For example, a hybrid which uses only $d_{z^2}$ orbitals will have the same moments as a hybrid which uses only $d_{x^2-y^2}$ orbitals. This is, perhaps, an obvious point as the moments we are discussing are radial moments; angular characteristics in the hybrid density are spherically averaged during the integration over the solid angle. To see this more clearly, note the form of the integral $\langle p^n \rangle$ for the case of an $sp^a$ hybrid.
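$$ \langle p^n \rangle = \frac{1}{1+a}\int \left(R_{2,0}(p)\,Y_{0,0} + \sqrt{a}\,R_{2,1}(p)\,Y_{1,m}\right)^{*} p^n \left(R_{2,0}(p)\,Y_{0,0} + \sqrt{a}\,R_{2,1}(p)\,Y_{1,m}\right) d\vec p \tag{23} $$

The orthogonality of the spherical harmonics, $Y_{l,m}$, ensures that there will be no mixing of radial terms, $R_{n,l}$, in the resulting integral. Since the spherical harmonics are also normalized to unity, simplification of Eqn.(23) will produce the following functional form.

$$ \langle p^n \rangle_{sp^a\ \mathrm{hybrid}} = \frac{1}{1+a}\int_0^\infty \left[R_{2,0}(p)^{*}\, p^n\, R_{2,0}(p) + a\,R_{2,1}(p)^{*}\, p^n\, R_{2,1}(p)\right] p^2\, dp \tag{24} $$

$$ \langle p^n \rangle_{sp^a\ \mathrm{hybrid}} = \frac{\langle p^n \rangle_{s} + a\,\langle p^n \rangle_{p}}{1+a} \tag{25} $$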
Therefore, the hybrid density moment is simply a properly weighted average of the individual orbital moments. The orthogonality of the spherical harmonics has ensured that different radial functions do not mix. An analogous expression holds for the radial moments of the $sp^a d^b$ hybrids.
$$ \langle p^n \rangle_{sp^a d^b\ \mathrm{hybrid}} = \frac{1}{1+a+b}\left(\langle p^n \rangle_{s} + a\,\langle p^n \rangle_{p} + b\,\langle p^n \rangle_{d}\right) \tag{26} $$
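The weighted-average rule can be illustrated with moments that are known in closed form. The sketch below is only an illustration: it uses the position-space analogue of Eqns.(25) and (26) together with the textbook hydrogenic expectation values $\langle r \rangle = [3n^2 - l(l+1)]/(2Z)$ and $\langle r^2 \rangle = n^2[5n^2 + 1 - 3l(l+1)]/(2Z^2)$ (taken here for Z = 1), and the helper names are hypothetical.

```python
from fractions import Fraction


def r_mean(n: int, l: int) -> Fraction:
    """Hydrogenic <r> for Z = 1."""
    return Fraction(3 * n * n - l * (l + 1), 2)


def r2_mean(n: int, l: int) -> Fraction:
    """Hydrogenic <r^2> for Z = 1."""
    return Fraction(n * n * (5 * n * n + 1 - 3 * l * (l + 1)), 2)


def hybrid_moment(orbital_moments, weights) -> Fraction:
    """Weighted average of orbital moments; weights are (1, a) or (1, a, b)."""
    total = sum(weights)
    return sum(w * m for w, m in zip(weights, orbital_moments)) / total


if __name__ == "__main__":
    a = 3  # sp^3 hybrid built from the n = 2 shell
    r_sp3 = hybrid_moment([r_mean(2, 0), r_mean(2, 1)], [1, a])
    print("sp^3    <r>  :", r_sp3, " closed form:", Fraction(6 + 5 * a, 1 + a))

    r2_sp3 = hybrid_moment([r2_mean(2, 0), r2_mean(2, 1)], [1, a])
    print("sp^3    <r^2>:", r2_sp3, " closed form:", Fraction(6 * (5 * a + 7), 1 + a))

    a, b = 3, 2  # sp^3 d^2 hybrid built from the n = 3 shell
    r_spd = hybrid_moment([r_mean(3, l) for l in (0, 1, 2)], [1, a, b])
    print("sp^3d^2 <r>  :", r_spd, " closed form:", Fraction(27 + 25 * a + 21 * b, 2 * (1 + a + b)))
```

Exact rational arithmetic is used so that the averaged orbital moments can be compared with the closed-form hybrid expressions without rounding.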
5. Conclusion

In 1929, Linus Pauling, together with Boris Podolsky, became the first to publish the momentum representation of the eigenfunctions of a single-particle Coulombic Hamiltonian. Although he did not publish any more work on momentum space concepts, he is nonetheless a pioneer in the field of Momentum Space Quantum Chemistry. In this work, we brought together two of Pauling's contributions to Chemistry. The first, as mentioned above, is his pioneering work in momentum space quantum mechanics. The second is the concept of hybrid orbitals, originally used to understand the strengths and directional characteristics of covalent bonds. Accordingly, we have combined these two areas by looking at the form of some of the most useful hybrid orbitals in momentum space.

Among the more interesting qualities of momentum space hybrids is the lack of strong directional asymmetry, this being one of the most noticeable characteristics of position space hybrids. Indeed, it is this directional asymmetry which has made hybrid orbitals so useful for describing directional bonds. In momentum space, however, the hybrids are inversion-symmetric and this is shown to have a profound effect on the nature of these orbitals. Another difference is the nodal structure of these atomic contributions to the total density. The hybrid orbitals as we know them, in position space, exhibit nodal surfaces, i.e. two-dimensional subspaces on which the density vanishes. This dimensionality is reduced in momentum space. Here, the nodes are invariably one-dimensional, i.e. curves that are formed by the intersection of real and imaginary nodal surfaces. Finally (for atoms), the momentum densities corresponding to hybrid orbitals exhibit a few basic extremal features close to the origin. These depend on the weight that is given to s, p and d contributions, and they determine the basic "look" of the density.
Outwardly, momentum-space hybrids share one feature with a related experimental quantity, the Compton profile: they all look alike. On closer inspection, however, there are a variety of complex features, mainly arising from the nodal structure of the orbitals. Apart from the obvious use of hybrids in position space for the description of bond situations, there is another feature that has always captured the interest of scientists and laymen: their intricate structure. This feature is less apparent in momentum space, but it is still present. If nothing else, its enjoyment makes a close look at these entities worthwhile.
6. Acknowledgments

We are grateful to Dr. Jiahu Wang and Minhhuy Hô for fruitful discussions, and to Dr. Zelek Herman for making the Pauling bibliography available to us.

REFERENCES
1. Z. S. Herman. Some early (and lasting) contributions of Linus Pauling to quantum mechanics and statistical mechanics. In Molecules in Natural Science and Medicine: An Encomium for Linus Pauling, Z. Maksić and M. Eckert-Maksić, Eds. Ellis Horwood Limited, West Sussex, England, 1991, pp. 179-200.
2. Linus Pauling. The Nature of the Chemical Bond. Cornell University Press, Ithaca, New York, 1948.
3. B. Podolsky and L. Pauling, Phys. Rev. 34 (1929) 109.
4. C.A. Coulson and W.E. Duncanson, Proc. Cambr. Phil. Soc. 37 (1941) 67.
5. A. Rozendaal and E.J. Baerends, Chem. Phys. 95 (1985) 57.
6. D.L. Cooper and S.D. Loades, J. Mol. Struct. 229 (1991) 189.
7. P. Kaijser and V.H. Smith Jr. In Methods and Structure in Quantum Science, J. Calais, O. Goscinski, J. Linderberg, and Y. Öhrn, Eds. Plenum Press, New York, 1976, pp. 417-426.
8. P. Kaijser and V.H. Smith Jr. Evaluation of momentum distributions and Compton profiles for atomic and molecular systems. In Advances in Quantum Chemistry, vol. 10. Academic Press, New York, 1977, pp. 37-76.
9. S.R. Gadre, A.C. Limaye and S.A. Kulkarni, J. Chem. Phys. 94 (1991) 8040.
10. W.M. Westgate, A.M. Simas and V.H. Smith Jr., J. Chem. Phys. 83 (1985) 4054.
11. I.R. Epstein and A.C. Tanner. In Compton Scattering, B. Williams, Ed. McGraw-Hill, New York, 1977, pp. 209-233.
Z.B. Maksić and W.J. Orville-Thomas (Editors), Pauling's Legacy: Modern Modelling of the Chemical Bond, Theoretical and Computational Chemistry, Vol. 6. © 1999 Elsevier Science B.V. All rights reserved.
Theory as a viable partner for experiment
The quest for trivalent silylium ions in solution

Carl-Henrik Ottosson,a Elfi Kraka,b and Dieter Cremerb*

aDepartment of Organic Chemistry, Chalmers University of Technology, Kemigården 3, S-41296 Göteborg, Sweden
bDepartment of Theoretical Chemistry, University of Göteborg, Kemigården 3, S-41296 Göteborg, Sweden
1. INTRODUCTION

Progress in science is often accelerated by controversy. The protagonists of different views put much effort into convincing their opponents and the scientific community of the correctness of their view. Especially if a discussion becomes heated and personal arguments are involved, a scientific feud attracts the attention of many bystanders, and opposing parties develop which investigate with increasing force the pros and cons of the scientific arguments. In this way, a scientific problem is often brought to a rapid solution. In chemistry, a well-known example of such a feud was the debate between Winstein and Brown about the classical or non-classical character of carbocations such as the norbornyl cation. [1,2] A similar feud developed in the last decade between Lambert and Olah on the existence of trivalent silylium ions (R3Si+) in solution, which are the silicon analogues of carbenium ions R3C+. [3,4] Silylium ions are easily observed by mass spectrometry as fragments of organosilyl compounds since they are thermodynamically more stable than carbenium ions. [5,6] Accordingly, one expects that silylium ions are readily formed in solution. However, it was found that they coordinate to solvent molecules or counterions so strongly that their silylium ion character is lost. Nevertheless, several promising attempts were made to synthesize R3Si+ ions that may be largely uncoordinated in solution. [7-16] The quest for trivalent silylium ions in solution triggered enhanced research efforts and, accordingly, it was foreseeable that this problem would be brought quickly to its solution.

In the past, the outcome of a scientific debate depended to a large extent on the choice of decisive experiments, support by spectroscopic results, and the combination of many facts and details in an intricate puzzle. Nowadays, experimental observations are often complemented by data obtained from quantum chemical calculations.
For example, it is now fully confirmed by accurate ab initio calculations that Winstein was right on the issue of the classical/non-classical carbocation question in the case of the norbornyl cation. [17] In the debate on the existence of a free silylium cation in solution, Olah [18-23] used ab initio results and argued that calculated 29Si chemical shifts of silylium ions in the gas phase differ considerably (by 200-300 ppm) from those measured by Lambert. [7-10] Hence, the use of computers and quantum chemical programs turns out to be decisive when testing the validity of an argument. In this way, a computer represents a deus ex machina that solves problems not amenable to experiment. Computer calculations based on ab initio theory have become so accurate that they can compete with the most sophisticated experimental measurements or even outdo them. A theoretician is often in a better position to make a firm judgement on a controversial issue. Ab initio theory has made its entrance into chemistry, and a modern chemist has to learn, besides synthesis and the use of spectroscopy, also the art of doing quantum chemical calculations.

We report in this article on the lively debate concerning the nature of silylium ions in condensed phases and the role that quantum chemical investigations played and still play in this debate. It is not so difficult to design a silylium cation which, by strong internal "complexation", should be unable to interact with solvent molecules. Internal complexation may change the nature of the cation in such a way that one can no longer speak of a silylium cation. Clearly, such a case is of no relevance for the silylium cation problem, but it demonstrates that there is a need to specify what is meant by a silylium cation and in particular a free silylium cation in solution.
We will approach the question of the existence of a free silylium cation in solution by first clarifying what properties a free silylium cation in solution should have and how these properties can be determined by either experimental or theoretical means. After these basic considerations, we will discuss and evaluate the various attempts at preparing free silylium cations in solution. In particular, we will see that the recent solution of the problem resulted as a natural consequence of the investigations carried out in the last years, where it was most important that research on silylium cations was based more and more on a combination of experimental and theoretical means. Hence, research on silylium cations can serve as an ideal example for similar problems in chemistry.
1.1. Why investigate silylium ions in solution?

It is justified to ask why so much attention has been paid to a problem that may be only interesting within Si chemistry. Why did the silylium cation problem lead to dozens of investigations, publications, and several review articles just within the last years? The answer to this question has to be given in three parts. First of all, there is of course the question whether a silyl cation chemistry can be established in solution phase in a similar way as this was done in the case of carbocations. There is a general chemical interest to see how
carbocations compare with silyl, germyl, stannyl, etc. cations and to find out about differences and similarities. [24] Secondly, there is a large overlap between the silylium cation problem and the problem of investigating ion solvation in general. Chemists try to understand the various steps in a solvation process, the accompanying changes in the properties of the solute and its consequences for chemical reactivity. Solvent-solute interactions span the broad spectrum from van der Waals type interactions to bonding between the interacting molecules. Specific and nonspecific solvation can occur, one or several solvent shells with complicated structures can be built up around the solute, and new chemical species with a different reactivity can be formed in a solvent-solute interaction. In the case of charged molecules in solution, one has to consider the interaction between ion and counterion, which may lead to tight or solvent-separated ion pairs. Hence, the solvation process has many facets and it is difficult to understand and describe solvation in general, no matter whether experimental or theoretical means are used for the problem. [25] Of course, a solvation process could be investigated much more easily if one could enforce a stepwise, controlled transition of a molecule from the gas to a solution phase in such a way that first just nonspecific solvent effects are considered and, then, one by one, specific interactions with individual solvent molecules are switched on. Such a controlled solvation process implies that the molecule in question must be able to interact strongly with individual solvent molecules and to build up, besides specific coordination, one or two additional solvation shells, i.e. an ion would be the ideal target for such an investigation. However, an ion interacts strongly with solvent molecules and it seems to be impossible to enforce solvation in a stepwise manner. This is exactly the point where the silylium cation problem comes in.
Provided a free silylium cation can be generated in solution by some suitable protecting, solvation-hindering mechanism, it will be possible to disassemble the protection stepwise and, by this, switch on solvation in a controlled manner while monitoring changes in molecular properties due to solvation. Against this background, a free silylium cation is an ideal target for investigating the mechanism of solvation. Finally, there is another aspect of the research on silylium cations. The question whether silylium cations are free or coordinated in solution requires determining the structure of a solvated molecule. Presently, there is no experimental method available that can fulfil this task in a detailed and satisfactory manner. Questions concerning the geometry of the solvated ion, the number of solvent molecules or counterions in contact with the ion, the type of solvent-solute interactions, etc. cannot be answered directly by experiment. The information that is available on silylium ions in solution stems almost exclusively from NMR spectroscopy in the form of chemical shift measurements. It is also possible to get additional information from X-ray structural analysis of ion-solvent complexes in the crystal state. However, the assumptions made to extrapolate from the solid state to solution phase can be as large as those used to extrapolate from the gas phase to solution phase. It is also difficult (but not impossible if sufficient computational resources are available) to get an accurate description of the arrangement of solvent molecules
around an ion by theoretical means. A distinction between contact ion pairs or solvent separated ion pairs implies a large amount of solvent modelling that cannot be carried out on a routine basis. Out of the dilemma that neither experiment nor theory provides a completely satisfactory description of the properties of solvated molecules, the NMR/ab initio/IGLO method [26] was born, which is an excellent example of the fruitful interplay between theory and experiment: The information available from experiment, in this case NMR chemical shifts, is used in combination with calculated data on the same property to determine the geometry of the compound investigated. This is possible because the calculated NMR chemical shifts are sensitive to the molecular geometry used in the calculation. Hence, agreement between measured and calculated NMR chemical shifts implies that the geometry used in the calculation is the geometry of the molecule under the conditions of the measurement (provided the method used for the NMR chemical shift calculation is accurate enough to make differences between the two sets of NMR chemical shifts meaningful). In this way, theory becomes an extra tool in the hands of the chemist that can be used to determine geometry, electronic nature, chemical bonding and other properties of solvated molecules. The silylium cations in solution represent excellent examples for testing this approach and proving its usefulness for experimental chemists.
1.2. Connection to Pauling's work and scope of the article
In his last scientific contribution, Pauling [27] discussed the crystal structure of triethylsilyl tetrakis(pentafluorophenyl)borate prepared from a toluene solution, which had been published by Lambert and co-workers. [8] These authors had interpreted the crystal structure as confirming the existence of a somewhat nonplanar silylium cation in the presence of a noncoordinating anion and a weakly interacting toluene molecule. In his contribution, Pauling rejected this interpretation by pointing out that the shortest Si,C(toluene) distance in the crystal structure indicates bonding interactions between the silylium cation and toluene. Furthermore, he noted that the positive charge of the Si atom is significantly distributed over the toluene molecule, which implies that silylium cation character has been (partially) converted into carbocation character by (partial) bonding to toluene. Although not stated explicitly in his contribution, Pauling attacks exactly the basic question of the silylium cation problem, namely to what extent bonding is established between a silylium cation and one or more solvent molecules and how much of its silylium cation nature is lost because of this. Quantum chemical calculations can provide a direct answer to this question and show whether Pauling's arguments are correct. Accordingly, we will discuss the silylium cation problem by focusing on the contribution that quantum chemistry can provide in this case. First, we will describe the quantum chemical methods needed for this purpose. Section 2 of this work is therefore devoted to a discussion of the NMR/ab initio/IGLO method and its extension to density functional theory (DFT), namely the NMR/DFT/IGLO method.
In Sections 3 and 4, the history of the silylium ion problem and the basis for a systematic description of silylium ions in solution are presented. So far, none of the protagonists of the scientific feud on the nature of silylium ions in solution has tried to clarify what is meant by terms such as silylium ion character, free, nearly free, or coordinated silylium ions, degree of complexation, and the type of interactions between solute and solvent. Of course, this has to do with the fact that, depending on the experiments performed, just one or two properties of the solvated silylium ions can be measured, and this is not sufficient to support basic definitions. However, calculations provide a variety of different silylium ion properties and, thereby, a basis can be established to give a rather conclusive description of silylium ions in solution. In Sections 5, 6 and 7, three different approaches to the problem of silylium ions in solution are described. First, the typical gas phase versus solution phase ab initio (DFT) description of silyl compounds and silylium ions is given (Section 5). In Section 6, the NMR/ab initio/IGLO and NMR/DFT/IGLO methods are used to investigate solvation of silylium ions in different solvents. This work demonstrates how complex the solvation process of a silylium ion can be and, therefore, there is a need to generate silylium ions under well-defined conditions in solution which simplify investigations. Out of this necessity, the idea of intramolecular solvation of silylium ions was born, which is discussed in Section 7. As soon as one realizes that silylium ions fiercely attack solvent molecules or counterions due to their "appetite" for bonding partners, one might give up the idea of uncoordinated silylium ions in solution. 
However, it will be shown that a thorough understanding of how a silylium ion may be stabilized and prevented from interacting with solvent molecules or counterions will help to design silyl cationic systems that approach the ideal case of a free silylium ion in solution. Various suggestions on how to generate uncoordinated silylium ions in solution are discussed on the basis of theoretical results in Section 8 and, in Section 9, the final solution to the problem is presented.
2. THE NMR/AB INITIO/IGLO METHOD
For experimentalists, NMR spectroscopy is one of the most important tools for investigating a newly synthesized compound. Nowadays, NMR measurements are standard in synthetic chemistry and, therefore, it is common that the first experimental information on an unknown chemical compound is obtained by recording its NMR spectrum. By using measured chemical shifts and coupling constants, and comparing these magnetic properties of the molecule under investigation with those of suitable reference molecules, it is in most cases possible to provide useful structural information on the new compound. However, if a newly synthesized compound contains unusual structural features, for which no useful reference data are available, it will be difficult to derive structural information from the measured NMR spectrum. In such a situation, theory can provide the missing link between measured NMR chemical
shifts and molecular structure. This requires that NMR chemical shifts (and spin-spin coupling constants) can be calculated with sufficient accuracy. Since the work of Ditchfield in the early 70s, the GIAO (Gauge Including Atomic Orbitals) method for chemical shift calculations has been available. [28] However, GIAO in its early version was a computationally expensive and not very accurate method that was only applicable to rather small molecules. Therefore, only little progress on quantum chemical methods for the calculation of NMR chemical shifts was made in the 70s. This changed considerably when in the early 80s Kutzelnigg suggested a new method for chemical shift calculations based on localized MOs. [29] Kutzelnigg together with Schindler worked out this method in a version that was easy to use and that is known today under the name "Individual Gauge for Localized Orbitals" (IGLO) method. [30] With this method, it was possible to calculate reasonable NMR chemical shifts for a range of different molecules. [31] In their early calculations, Kutzelnigg and Schindler used experimental geometries which originated from microwave, infrared, electron diffraction or even X-ray diffraction measurements and, therefore, the calculated NMR chemical shifts were inconsistent, depending too much on the accuracy of the experimental geometry used. [30-32] Schleyer was among the first to realize the importance of using accurate geometries in IGLO NMR chemical shift calculations. [33] As was found in benchmark studies, it is necessary that chemical shifts are calculated for quantum chemically determined molecular geometries of high and consistent accuracy. Experimentally, it is almost impossible to obtain geometries for a range of different molecules with the same accuracy. However, this can easily be achieved by calculating geometries with ab initio theory. 
For example, Reichel and Cremer showed that molecular geometries obtained at the HF/DZ+P level of theory are accurate enough to lead to a consistent evaluation of 13C chemical shifts. [34] Even better results are obtained when the NMR chemical shift calculations are based on MP2, CCSD(T) or DFT geometries obtained with a DZ+P or TZ+2P basis set. It was found that the calculated NMR chemical shifts depend sensitively on the molecular geometry used, [33-38] which is particularly true in the case of 29Si NMR chemical shifts. [39-48] Two examples illustrate this. Figure 1 shows that the δ29Si value of a silylium ion R3Si+ depends on the degree of pyramidalization. There is a relatively strong downfield shift with increasing pyramidalization, which is interesting in view of contrary claims made by Lambert and co-workers. [8] These authors expected that pyramidalization of silylium ions leads to upfield shifts of the δ29Si value, thus explaining the rather low 29Si NMR chemical shifts measured for silylium ions in solution. Figure 1 reveals that there is indeed a strong dependence of 29Si NMR chemical shifts on geometrical distortions; however, this dependence is opposite to expectations and does not support the premature conclusions made by the experimentalists. In Figure 2, the calculated dependence of the 29Si chemical shift of the (Me3Si)3Si(C6H6)+ complex on the Si-C(benzene) interaction distance is shown. There is a linear relationship between shift value and Si-C distance that gives an increase of 62 ppm for an elongation of the Si-C distance by 0.1 Å. In most cases
[Figure 1: plot of δ29Si [ppm] against the pyramidalization angle α [degree] for R3Si+ with R = H and R = Me; graphic not reproduced.]
Figure 1. Dependence of the 29Si shift on the degree of pyramidalization of a silylium ion. (HF-IGLO/[7s6p2d/5s4p1d/3s1p] calculations from Ref. 41)
[Figure 2: plot of δ29Si [ppm] against the Si-C distance r(Si-C) [Å]; graphic not reproduced.]
Figure 2. Dependence of the 29Si shift on the Si-C(benzene) bond length in the Wheland σ-complex (Me3Si)3Si(C6H6)+ involving the tris(trimethylsilyl)silylium cation and benzene. (HF-IGLO/[7s6p2d/5s4p1d/3s1p] calculations from Ref. 45)
NMR chemical shifts can be measured with high accuracy and, therefore, the reliability of IGLO chemical shifts enables the determination of molecular geometries. For example, it was found by Ottosson and Cremer [45] that the experimental shift value of 111 ppm [9] corresponds to a Si-C(benzene) bond length of 2.29 Å. Such a distance is indicative of covalent bonding and, therefore, documents the loss of silylium ion character in the benzene complex. The dependence of calculated NMR chemical shifts on the molecular geometry is the basis of the NMR/ab initio/IGLO and NMR/DFT/IGLO methods. Schleyer and co-workers [33] and, independently, Cremer and co-workers [37,38] found in the case of 13C NMR chemical shifts that once an accurate geometry of a molecule had been calculated, measured and calculated NMR shifts agreed in almost all cases. Even though NMR spectroscopy does not provide any direct information on the geometry of a molecule, the latter can be determined by combining calculated and experimental shifts. If the two sets of shift values agree, this will be convincing proof that the geometry used in the calculation is close or identical to the molecular geometry under the conditions of the measurement. This has been confirmed in many cases where it is possible to compare with an experimental geometry. One of the most convincing examples of a successful use of the NMR/ab initio/IGLO method has been given by Cremer and co-workers in the case of the homotropenylium cation, [34,37,38] for which previous investigations had provided rather confusing descriptions of its equilibrium geometry, all favouring a classical carbocation structure. [49-51] Comparison of the experimental NMR chemical shifts (measured in solution) and the IGLO shift values strongly suggested a non-classical geometry of the carbocation with a very long C1-C7 distance of 2 Å. 
[37] This fact could later be confirmed by high level ab initio calculations performed at the MP4 level of theory. [37,38,52] One may argue that the geometry of a non-linear K-atomic molecule, which is defined by 3K-6 internal parameters or coordinates, cannot be described by the isotropic chemical shifts of K atoms. In principle, one could use for each nucleus the measured shielding tensor and gain in this way additional experimental information to fix the 3K-6 geometrical parameters of a molecule. However, measurement of the shielding tensor is not a routine technique in NMR spectroscopy. One could also use measured NMR coupling constants, but the relationships between coupling constants and geometrical parameters, in particular bond lengths and bond angles, are only at a preliminary stage of being explored by theory. [53] Hence, one might criticize the accuracy of a method that is based on at most K rather than 3K-6 parameters. Although the geometry of a molecule is determined by 3K-6 internal coordinates, only a few of these coordinates are really critical in a specific case. For instance, any ab initio calculation can provide a reasonable estimate of C-H bond lengths or C-C-H bond angles of a hydrocarbon, and it is well-known that these parameters do not change very much when the molecule is transferred from the gas to the solution phase. The same is true for many other geometrical parameters of a molecule, and there are just a few that critically depend on the molecular
environment and change for a transition of the molecule from the gas to a condensed phase. Clearly, the most significant change in the molecular geometry of a donor-acceptor complex upon solvation is given by an increase (decrease) of the interaction distance between donor and acceptor atoms. An instructive example for such a case is the donor-acceptor complex borane monoammoniate, BH3NH3, for which both gas phase and crystal state geometries are available. [54-56] The microwave value of the B-N distance is 1.658 Å, [55] while the solid state value is close to 1.6 Å. [56] However, experiment does not provide any insight into the geometry of BH3NH3 in aqueous solution. The only data available are the 11B and 15N chemical shifts of the molecule in this medium. [57] Cremer and co-workers calculated IGLO 11B and 15N chemical shifts of BH3NH3 using the HF/6-31G(d,p) geometry. [38] A large deviation between measured and calculated shift values was found, and it was concluded that the difference results from the use of the gas phase geometry for calculating the properties of a molecule in solution. When reoptimizing the geometry of the complex under conditions that simulate an aqueous solution, a shortening of the N-B distance from 1.687 to 1.610 Å was observed. [38] The IGLO chemical shift values calculated by using this new geometry were in satisfactory agreement with the experimental values, which suggested that the calculated geometry is close to the geometry of BH3NH3 in aqueous solution. [38,54] In connection with the work on BH3NH3, it was observed that changes in the geometry due to solvent effects are limited to just one geometrical parameter. In this connection, one can speak of a "leading parameter principle" since in most cases just a few parameters are significantly altered under the impact of changes in the surrounding medium. 
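The role of a single leading parameter can be illustrated with the Si-C(benzene) example from earlier in this section, where the text quotes a linear shift-distance relation (62 ppm per 0.1 Å) and one point on it (111 ppm at 2.29 Å). The Python sketch below simply inverts that line. The function names are hypothetical, and treating the relation as exactly linear outside the distance range scanned in the calculations is an assumption.

```python
# Values quoted in the text for the (Me3Si)3Si(C6H6)+ complex.
SLOPE_PPM_PER_ANGSTROM = 620.0  # 62 ppm per 0.1 Å elongation
REF_DISTANCE = 2.29             # Å, Si-C(benzene) distance
REF_SHIFT = 111.0               # ppm, experimental 29Si shift


def shift_from_distance(r_sic):
    """29Si shift (ppm) predicted by the linear relation at distance r_sic (Å)."""
    return REF_SHIFT + SLOPE_PPM_PER_ANGSTROM * (r_sic - REF_DISTANCE)


def distance_from_shift(delta_si):
    """Leading parameter r(Si-C) in Å implied by a 29Si shift in ppm."""
    return REF_DISTANCE + (delta_si - REF_SHIFT) / SLOPE_PPM_PER_ANGSTROM


# A shift 62 ppm further downfield maps to a distance 0.1 Å longer.
print(round(distance_from_shift(173.0), 3))
```

Once the leading parameter is fixed in this way, the remaining coordinates would be relaxed by a standard geometry optimization, as described in the text that follows.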
The other geometrical parameters adjust smoothly to changes in the leading parameters and, in general, it is straightforward to calculate these adjustments by standard ab initio techniques, once the major changes in the leading parameters have been determined by the NMR/ab initio/IGLO method. Of course, there are factors other than geometry which influence the values of NMR chemical shifts. For instance, inclusion of electron correlation in NMR chemical shift calculations was found necessary in many cases. [58,59] There are methods such as the GIAO-MP (GIAO at the Møller-Plesset level) or GIAO-CC (GIAO at the Coupled Cluster level) methods, [59] which can handle this problem and lead to very accurate chemical shifts. However, the price one has to pay in terms of computer time is rather high, so that these methods can be applied to larger molecules only in exceptional cases. An alternative is provided by the DFT-IGLO method since DFT includes a larger amount of dynamic correlation effects in an unspecified, but nevertheless effective way, which is of advantage when calculating NMR chemical shifts. DFT-IGLO in the way recently described by Olsson and Cremer [60,61] leads to very accurate NMR chemical shift values at moderate computational cost. [62] It is based on the sum-over-states density functional perturbation theory (SOS-DFPT) of Malkin and co-workers [63] and the original Hartree-Fock IGLO method of Kutzelnigg and Schindler. [30] The Coulomb part of the electron interaction energy is accurately calculated while the
exchange-correlation potential is determined by numerical integration. [60] The well-known deficiency of DFT methods, namely that they lead to occupied orbitals with relatively high energies and, accordingly, to an overestimation of paramagnetic contributions to chemical shifts, [64] is compensated within SOS-DFPT by adding appropriate level shift factors to orbital energy differences, as was first suggested by Malkin and co-workers [65] and studied in detail by Olsson and Cremer. [60] For the DFT shift calculations, a combination of the Becke exchange [66] and the Perdew-Wang (PW91) correlation functionals [67] is used, while the geometries needed for the shift calculations are normally determined by the so-called B3LYP method, which is based on Becke's three parameter functional (1) [68,69]
\[ (1-a)\,E_x(\mathrm{Slater}) + a\,E_x(\mathrm{HF}) + b\,E_x(\mathrm{Becke}) + c\,E_c(\mathrm{LYP}) + (1-c)\,E_c(\mathrm{VWN}) \tag{1} \]
(Ex(Slater): Slater exchange, Ex(HF): HF exchange, Ex(Becke): gradient part of the exchange functional of Becke, Ec(LYP): correlation functional of Lee, Yang, and Parr (LYP), Ec(VWN): correlation functional of Vosko, Wilk, and Nusair; a, b, c: coefficients determined by Becke using a three-parameter fit to experimental heats of formation [68]). B3LYP-DFT calculations with a VDZ+P basis such as 6-31G(d,p) or a more elaborate VTZ+2P+diff+f basis (diffuse s,p functions and f functions are added to VTZ+2P) such as 6-311+G(3df,2pd) lead to rather accurate geometries and energies [70] comparable with MP2 or in some cases even with CCSD(T) correlation-corrected results. [62,71] Hence, one can determine the structure of a silylium cation or any other molecule in solution by an NMR/DFT/IGLO approach, provided measured NMR chemical shifts are available. While VDZ+P basis sets are sufficient to get reliable geometries for silylium cations, NMR chemical shift calculations with DFT-IGLO (or HF-IGLO) for 29Si, 13C or other nuclei require special basis sets, which have been designed by Kutzelnigg and co-workers [31] for this purpose. Calculations with the (11s7p2d/9s5p1d/5s1p) [7s6p2d/5s4p1d/3s1p] and the (12s8p3d/11s7p2d/6s2p) [8s7p3d/7s6p2d/4s2p] basis sets, which are of VTZ+P and VQZ+2P quality, respectively, are particularly reliable. Apart from geometry, correlation, and basis set effects, the consideration of environmental effects in an ab initio or DFT investigation can influence calculated NMR chemical shift values. For example, the general impact of a solvent on a solute can change the NMR chemical shift value of a solvated molecule. [38] Solvent molecules arrange around a solute molecule in such a way that electrostatic interactions lead to maximum stabilization of the solvated molecule. A solvent with a relatively large dielectric constant will lead to increased charge separation in the molecule, i.e. 
polar bonds become more polar and bond dipole moments larger. These are effects that require an explicit inclusion of solvent effects in the geometry optimization, but also sometimes in NMR chemical shift calculations. For this purpose, the PISA continuum model developed by Tomasi and coworkers [72] can be utilized as was shown by Cremer and co-workers. [34,38,41-47]
In this approach, the solute is embedded in a polarizable continuum, which has the same dielectric constant as the solvent in question. The cavity, into which a solute molecule is placed, is constructed by centring at each atom of the solute molecule a sphere whose size depends on the van der Waals radius of that atom. In this way, a surface is obtained that can be corrected for unphysical situations with sharp turns between two spheres by smoothing the surface with smaller spheres that fit into the gap between two larger spheres. When a cavity with a smooth surface has been determined, point charges are distributed on its surface utilizing the electric field from the isolated molecule. The magnitude of the charges is determined by the dielectric constant of the medium. The charge distribution in the solute influences the charge distribution on the cavity surface and vice versa, and therefore, an iterative procedure has to be applied to get the correct charge distribution for the solute molecule and the cavity surface. Within the PISA continuum model one can calculate electrostatic, dispersion, exchange repulsion and cavity energies. However, since the latter terms are included in a more empirical rather than a direct quantum mechanical way and, in addition, do not contribute significantly to changes in geometry, mostly just the electrostatic part of the solvation energy is considered when describing solvation effects in a particular solvent characterized by the dielectric constant ε. In this way, the NMR/ab initio/IGLO or NMR/DFT/IGLO method can be based on PISA solvent rather than gas phase geometries. Alternatively, reaction field calculations with the IPCM (isodensity surface polarized continuum model) [73,74] can be performed to model solvent effects. In this approach, an isodensity surface defined by a value of 0.0004 a.u. of the total electron density distribution is calculated at the level of theory employed. 
Such an isodensity surface has been found to define rather accurately the volume of a molecule [75] and, therefore, it should also define a reasonable cavity for the solute molecule within the polarizable continuum, where the cavity can iteratively be adjusted when improving wavefunction and electron density distribution during a self-consistent field (SCF) calculation at the HF or DFT level. The IPCM method also has the advantage that geometry optimization of the solute molecule is easier than for the PISA model and, apart from this, electron correlation effects can be included in the IPCM calculation. For the investigation of Si compounds (either neutral or ionic) in solution both the PISA and IPCM methods have been used. [41-47] A problem for many ab initio calculations is the computational cost, which often limits the size of the molecules that can be investigated. The cost of an SCF calculation is proportional to N^4/8, where N is the number of basis functions. In a conventional SCF calculation the two-electron integrals are stored on disk; however, this quickly becomes problematic when the molecular size increases. Therefore, one uses for large molecules a Direct SCF (DSCF) algorithm in which the two-electron integrals are no longer stored on disk after their first calculation, but recalculated in each SCF iteration. [76] In fact, the DSCF method is faster than conventional SCF for systems with more than ~100 basis functions. In IGLO calculations, relatively large basis sets must normally be used to get reliable results. With increasing size of the basis set and increasing size of the molecule,
limited disk space makes IGLO calculations very difficult, if not impossible. Again, a solution to this problem is found by using a direct rather than the conventional algorithm for IGLO (DIGLO) NMR chemical shift calculations. [42,77] By using DIGLO, NMR chemical shift calculations with as many as 1000 basis functions are no longer a problem.
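The N^4/8 scaling and the resulting disk problem can be made tangible with a rough count. The Python sketch below counts the symmetry-unique two-electron integrals for N real basis functions and converts the count into storage. The 8 bytes per integral, the neglect of index labels and of any integral screening, and the function names are assumptions made for this estimate.

```python
def unique_two_electron_integrals(n_basis):
    """Count of symmetry-unique integrals (ij|kl) for real basis functions."""
    pairs = n_basis * (n_basis + 1) // 2   # unique index pairs ij
    return pairs * (pairs + 1) // 2        # unique pairs of pairs


def storage_gigabytes(n_basis, bytes_per_integral=8):
    """Disk space needed to store every unique integral once (assumed 8 bytes each)."""
    return unique_two_electron_integrals(n_basis) * bytes_per_integral / 1e9


for n in (100, 300, 1000):
    exact = unique_two_electron_integrals(n)
    estimate = n**4 // 8                   # the leading N^4/8 term
    print(n, exact, estimate, round(storage_gigabytes(n), 1))
```

Under these assumptions the storage demand for 1000 basis functions is on the order of a terabyte, which is consistent with the motivation given for direct algorithms that recompute the integrals instead of storing them.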
3. THE SILYLIUM ION PROBLEM
Silylium ions R3Si+ are known to be more stable in the gas phase than their carbon analogues, which is due to the fact that silicon is more electropositive than carbon. It is easy to observe them by mass spectrometry or ion cyclotron resonance spectroscopy of triorganosilyl compounds. [5] However, their existence in solution is difficult to prove and, therefore, is still a matter of dispute. Early attempts to synthesize silylium ions in solution with methods successful in the preparation of carbocations failed. [78] The lack of stability of R3Si+ in solution was attributed to the strong tendency of Si to form hypercoordinated compounds, in which Si has five or even six bonding partners. [79] During the 80s and early 90s, Lambert and co-workers reported a number of experiments which, according to the authors, suggested the existence of free silylium ions in solution. [80-84] The alleged silylium ions were prepared by hydride abstraction from an alkylsilane with triphenylcarbenium perchlorate according to reaction (2), first suggested by Corey. [85]
\[ \mathrm{R_3SiH} + \mathrm{Ph_3C^+\,ClO_4^-} \longrightarrow \mathrm{R_3Si^+\,ClO_4^-} + \mathrm{Ph_3CH} \tag{2} \]
The experiments involved compounds such as Me3SiClO4, Ph3SiClO4, PhMe2SiClO4, MePh2SiClO4, (MeS)3SiClO4, (EtS)3SiClO4, and (i-PrS)3SiClO4 in solvents ranging from sulfolane, dichloromethane, and 1,2-dichloroethane to acetonitrile. [80-84] Evidence for the existence of free silylium ions came from conductance measurements and 35Cl/37Cl NMR spectroscopy. However, Lambert's results were questioned by several authors, in particular Olah. [18-20,86] Olah criticized the methods used to dry solvents and argued that residual water impurities in the solvents would cause hydrolysis of R3SiClO4 to form free perchlorate ions, which would explain the 35Cl/37Cl NMR observations. Strong evidence against the alleged observation of free silylium ions also came from 29Si NMR measurements on trimethylsilyl perchlorate in sulfolane, which revealed no dependence of δ29Si on the concentration of the perchlorate. Lambert rejected this criticism by pointing out that water was present in the solvents used at a molarity one order of magnitude below that of the observed silylium ions. [84] Apart from this, Lambert and co-workers improved their experiments and published new evidence for free or nearly free silylium ions in solution. [7-10] Additional evidence supporting Lambert's interpretation of experimental results came from Reed and co-workers. [11-16]
Even though experiments could not give a clear answer to the question of whether free or nearly free silylium ions exist in solution, they revealed some major differences between silylium and carbenium ions. The latter possess similar stabilities in gas and solution phases. [87] 13C NMR chemical shifts for carbocations calculated with ab initio methods for the gas phase agree remarkably well with those obtained experimentally for carbocations in solution. [88] It seems that counterions and solvents such as SO2F2 and SO2ClF used in connection with FSO3H, SbF5, etc. do not change the properties of carbocations much and, therefore, medium effects can often be neglected when discussing the properties of carbocations in solution. For silylium ions, one cannot expect a similar independence of the medium, since Si easily extends its coordination sphere by forming hypercoordinated compounds with five or six ligands. [79] In solution, nucleophilic solvent molecules S (or counterions X-) will be suitable coordination partners (Scheme 1).
[Scheme 1: silylium ion R3Si+ coordinated by solvent molecules S; drawing not reproduced.]
Even a weakly associated S molecule will transfer some negative charge to Si, thus leading to changes in its electronic structure, geometry, and magnetic properties. For example, charge transfer will shield the Si nucleus and, as a consequence, 29Si NMR chemical shifts atypical of a silylium ion will result. Therefore, the question is how much silylium ion character is retained in solution and whether it is still justified to speak of silylium ions. In view of the many controversial experimental observations that were reported, an investigation of possible interactions of R3Si+ with solvent molecules S was needed. Considering the many limitations of experiment, it was clear that such a description had to be given by ab initio calculations. [41] Of course, the first step of such an investigation was to determine the properties of silylium ions in the gas phase. These properties represent appropriate reference data that have to be considered when assessing the degree of silylium ion character retained in solvated silyl cations.
3.1. Properties of silylium ions in the gas phase
Some calculated properties of silylium cations R3Si+ (R = H or CH3, 1-4) are listed in Table 1. Figure 3 gives the geometries of H3Si+ (1) and Me3Si+ (4) calculated at the HF, MP2, and DFT-B3LYP levels of theory. Silylium ions possess planar equilibrium geometries, which is in line with the geometries of carbenium ions. [89] Geometries calculated at the HF, MP2 or DFT level with
DZ+P or better basis sets are similar, suggesting that these methods give a reasonable description of geometrical parameters. [6,39-48]
Table 1. DFT/IGLO NMR chemical shifts, 3pπ(Si) populations and Si charges of silylium cations 1-4. a
#   Molecule      Sym   δ29Si   3pπ(Si)   +q(Si)
1   SiH3+         D3h   320.7   0         760
2   SiH2CH3+      Cs    350.3   116       790
3   SiH(CH3)2+    C2v   365.3   132       835
4   Si(CH3)3+     C3h   380.9   196       885
a NMR chemical shifts in ppm relative to TMS obtained with the [8s7p3d/7s6p2d/4s2p] basis. The electron population of the 3pπ(Si) orbital and the positive charge of the Si atom (+q) are given in melectron. Geometries and Mulliken charges were obtained at the B3LYP/6-31G(d) level of theory. See Ref. 48.
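The trends in Table 1 are easier to see as stepwise increments per added methyl group. The short Python sketch below only re-expresses the tabulated numbers; the data layout and the helper name are illustrative assumptions.

```python
# Rows of Table 1: (number, formula, methyl groups,
#                   delta 29Si / ppm, 3p-pi(Si) population / melectron,
#                   Si charge / melectron)
TABLE_1 = [
    (1, "SiH3+", 0, 320.7, 0, 760),
    (2, "SiH2CH3+", 1, 350.3, 116, 790),
    (3, "SiH(CH3)2+", 2, 365.3, 132, 835),
    (4, "Si(CH3)3+", 3, 380.9, 196, 885),
]


def increments(rows, column):
    """Change of one tabulated property for each additional methyl group."""
    values = [row[column] for row in rows]
    return [round(b - a, 1) for a, b in zip(values, values[1:])]


shift_steps = increments(TABLE_1, 3)       # ppm per methyl group
population_steps = increments(TABLE_1, 4)  # melectron per methyl group
charge_steps = increments(TABLE_1, 5)      # melectron per methyl group
print(shift_steps, population_steps, charge_steps)
```

Read this way, the first methyl group produces the largest single downfield step in the tabulated shifts, while the Si charge grows with every methyl group added.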
Figure 3. Geometries of H3Si+ (1) and Me3Si+ (4) at the MP2/6-311G(d,p) and B3LYP/6-31G(d) levels of theory. [47,48] B3LYP values in italics. Bond lengths in Å, angles in degrees.
An alkyl group stabilizes a silylium ion more than a neutral Si compound, which is a result of stronger hyperconjugative interactions between Si+ and the alkyl group. The hyperconjugative stabilization energy caused by alkyl groups in neutral R3SiX compounds is less than 10 kcal/mol, which is because of the poor overlap between 3pπ(Si) and 2pπ(C) orbitals. [41,47] However, for silylium ions, the 3pπ(Si) orbital contracts, the SiC bond length becomes shorter and, hence, the overlap between 3pπ(Si) and 2pπ(C) orbitals is increased. Even more important,
the 3pπ(Si) orbital in silylium ions is a low-lying unoccupied orbital and, therefore, two-electron stabilizing interactions are much larger than in neutral silyl compounds. Accordingly, the trimethyl stabilization energy of Me3Si+ is 37 kcal/mol (using H3Si+ as a reference) and, therefore, trialkylated silylium ions R3Si+ are less electrophilic than the parent silyl cation H3Si+. [46] The calculated 29Si NMR shift of 1 is 321 ppm. With successive methylation in the series 1-4 it increases to 381 ppm (Table 1), which is the result of three different electronic effects. First, hyperconjugation between the methyl groups and the Si+ center leads to some charge transfer from the methyl groups into the 3pπ(Si) orbital, as is indicated by the calculated 3pπ(Si) population listed in Table 1. This is in line with the reduced electrophilicity of methyl-substituted silylium ions indicated by an increased LUMO energy. Second, the inductive effect of the methyl groups (acting via the σ-orbitals) is more electron withdrawing than that of an H atom. By this, the total positive charge of the Si atom increases in the series 1 to 4 (Table 1) so that the Si nucleus becomes more deshielded and the 29Si NMR signal is shifted downfield. Third, the downfield shift of δ29Si is enhanced by significant paramagnetic contributions, which depend on σ-π excitation energies and can be estimated from energy differences between occupied and virtual MOs. Although the HOMO-LUMO energy difference increases when H atoms in 1 are replaced by methyl groups, differences involving next-to-LUMO and next-to-HOMO orbitals decrease, which is mostly indicative of an increase of paramagnetic contributions to NMR shieldings. In solution, solvent molecules will arrange around a silylium cation in such a way that their electron-rich parts are positioned opposite the empty 3pπ(Si) orbital. This will lead to a change in both diamagnetic and paramagnetic contributions to the δ29Si NMR chemical shift. 
Depending on the dielectric constant used in these calculations, the IGLO-PISA method predicts that the 29Si shifts move upfield by merely 10-20 ppm. [41] Clearly, a geometrical distortion of a molecule will alter its NMR chemical shifts. Lambert argued that a silylium ion in solution easily pyramidalizes because of steric and/or environmental effects, which could cause the 29Si shift to move upfield. [8] Since Lambert did not observe any 29Si shift value more downfield than 111 ppm, his reasoning could explain the large deviation of more than 200 ppm between measured and calculated 29Si shifts. However, when a silylium ion is distorted from a planar to a pyramidal geometry with R-Si-R angles of 109.5° as shown in Scheme 2, the IGLO 29Si shift moves in the downfield rather than the upfield direction (for a change in the pyramidalization angle from 90 to 109.5°, Δδ29Si(1) = 90 ppm and Δδ29Si(4) = 40 ppm), as shown in Figure 1. [41] These downfield shifts are due to paramagnetic currents in the plane of the σ-bonds caused by σ-π excitations. When the silylium ion is pyramidalized, the pπ-orbital mixes with σ-orbitals and obtains partial s-character. The energies of the RSi bonding orbitals are raised, that of the LUMO is lowered, σ-π excitations are facilitated, and paramagnetic currents are enhanced, which leads to the calculated downfield shifts of the δ29Si value.
Upon pyramidalization, the energy increases by as much as 24 (1) and 21 (4) kcal/mol, which is considerably larger than what can be accounted for by crystal packing effects. Furthermore, both the increase in energy and the downfield shifts remain when energies and 29Si shifts are calculated within a polarizable solvent continuum. [41] Accordingly, the statement made by Lambert and co-workers that environmental geometry distortions lead to upfield changes in the 29Si NMR chemical shift [8] has to be rejected. Instead, pyramidalization leads to a considerable energy increase and a downfield shift of δ29Si. It is unlikely that external effects are able to distort silylium ions from their planar equilibrium geometries to any larger extent. The experimentally determined δ29Si NMR chemical shifts of silyl cations in solution, which are below 115 ppm, [16] clearly indicate coordinated silyl cationic complexes.

[Scheme 2: R3Si+ in the planar geometry (R-Si-R angle 120.0°) and in the pyramidal geometry (R-Si-R angle 109.5°).]
4. SILYLIUM AND CARBENIUM IONS IN SOLUTION. INTERACTION OF SOLVENTS AND COUNTERIONS

The interaction between a solute and the solvent can be either specific or nonspecific. In the first case, a solute molecule interacts strongly with one or more solvent molecules so that well-defined complexes are formed, whereas in the second case, one or more solvent shells are formed around the solute molecule. [90] The latter interacts with the molecules of the solvent shell by electrostatic, induction, dispersion or exchange repulsion forces, but not by bonding forces as is possible in the case of specific solvation. Nevertheless, non-specific solvent interactions can alter the chemical character of a species, for instance its geometry, and the solvated molecule can be forced to dissociate into ions. Even if dissociation does not occur, the geometry of the solute can change significantly. It has been shown that donor-acceptor complexes AB, composed of neutral fragments A and B, normally reduce the A-B bond length in polar solvents, as observed previously for BH3NH3. [38,91]
4.1. Definition of a nearly free silylium ion R3Si+ in solution

One might expect that weakly nucleophilic solvents such as alkanes or halogenated alkanes should not interact with silylium ions in solution and, therefore, represent solvents that retain the silylium ion character of ions R3Si+.
However, calculated data on R3Si(S)+ complexes with S being a very weakly coordinating solvent show that this assumption is false (Table 2). Even in liquid noble gases such as argon and neon, there would be some interaction between the positive Si atom and the solvent S so that the 29Si shift of ion R3Si(S)+ differs from that of a free R3Si+. [6] Already in liquid methane, interactions would be so large that the shift value changes by 22 ppm (Table 2). Only in liquid helium would it be possible to generate cationic species that can be termed "free silylium ions" because their properties would be similar (within 1%) to those determined for the gas phase.

Table 2 Selected data for complexes Me3Si(S)+. a

Solvent S   Closest RSi-S   ΔE     Δδ29Si    Reference
none        -               0      0         41, 6
He          3.224           0.4    -1.7      6
Ne          2.490           4.4    -35.7     6
Ar          3.082           2.6    -71.7     6
CH4         3.379           3.4    -21.8     44
HCl         2.545           9.3    -192.7    41

a From HF/6-31G(d) and MP2(FC)/6-31G(d) (values in italics) calculations. Bond distances RSi-S are given in Å, complexation energies ΔE in kcal/mol, and 29Si NMR chemical shifts in ppm relative to TMS. Gas phase shielding values for the Si nucleus are 355.9 (Ref. 41) and 346.7 ppm (Ref. 6), respectively. Complexation energies are calculated for the reaction Me3Si(S)n+ → Me3Si+ + n S. IGLO NMR shifts are calculated at IGLO/[7s6p2d/5s4p1d/3s1p]//HF/6-31G(d) except for values in italics, which originate from IGLO/[7s6p2d/5s4p1d/3s1p]//MP2(FC)/6-31G(d) calculations.

From the results presented in Table 2 one could get the impression that a completely free silylium ion in solution is an illusion that can never be realized. For all silylium ions that are generated in solution, one can expect some kind of coordination with solvent (counterion) molecules and partial or complete loss of the silylium ion character. Therefore, the actual problem is how the degree of remaining silylium ion character can be determined for R3Si(S)n+ ions. Inspection of Table 2 reveals that properties such as Si-S interaction distance, coordination energy or 29Si chemical shift do not change in the same way and that the 29Si chemical shift is clearly more sensitive to any solvent-solute interaction than either energy or geometry. Therefore, one has to approach the question of the silylium ion character in a more fundamental manner. The H3Si+ cation in the gas phase is characterized by an empty 3pπ(Si) orbital that in this case is identical with the LUMO. As soon as the H atoms are replaced by alkyl groups, the 3pπ(Si) orbital mixes with some of the occupied MOs and, therefore, becomes partially occupied (see Table 1). However, for all alkyl substituted silylium ions the LUMO is still dominated by the 3pπ(Si) orbital. In
this respect, the availability of the 3pπ(Si+) orbital for nucleophilic interaction partners is of central importance and the population of this AO is a natural measure for the degree of silylium ion character. One might argue that the charge at the central Si atom is also a suitable measure for the silylium ion character. However, the positive charge at Si does not always parallel the occupation of the 3pπ(Si+) AO. For instance, strongly σ-electron donating substituents Y attached to Si+ reduce the positive charge but do not alter the population of the 3pπ(Si+) orbital. Thus, a cation Y3Si+ with such substituents could still be regarded as a silylium ion despite less positive charge at the Si atom. On the other hand, if Y is a σ-withdrawing group with π-type lone-pair electrons (e.g. NH2), then Y could inductively increase the positive charge at Si while the lone-pair orbitals on Y donate negative charge to the 3pπ(Si+) orbital, filling it up in this way. This scenario would reduce the availability of the LUMO, increase the stability of the silylium ion and, thus, lower its electrophilicity. Although Y3Si+ in the latter situation might possess a full positive charge at Si due to inductive withdrawal by Y, the silylium ion character, i.e. the trivalency of the Si atom and the availability of the 3pπ(Si+) AO, would be lost. A silylium ion Y3Si+ with Y = NR2, OR, etc. possesses partial SiY double bonds (Scheme 3). For example, the carbocation analogues of ions 5 and 6 in Scheme 3 are considered to possess just little trivalent carbenium ion character. [87a] An amino substituent completely changes the nature of the LUMO of a silylium cation. For the silaguanidinium cation, (NH2)3Si+ (5), the next-LUMO rather than the LUMO itself possesses the expected 3pπ(Si) character. Moreover, the energies of both LUMO and next-LUMO are high and, therefore, its electrophilicity is much lower than that of an alkylsilylium cation.
[Scheme 3: resonance structures of the NH2-substituted and OH-substituted silyl cations 5 and 6.]

Frenking and co-workers proposed 5 to represent a true trivalent silylium ion in condensed phases. [92] For this cation, the effect of inductive electron withdrawal and conjugative backdonation by substituents Y in Y3Si+ is nicely illustrated by calculated charges (see Table 3): The total charge at the Si atom is
very positive (+2.19 e) while charges at the N atoms are negative (-1.34 e). Frenking and co-workers argued that the Si-N bonds possess high electrostatic character, and as a proof for this hypothesis, the fact was mentioned that the barrier for NH2 rotation is calculated to be just 5.5 kcal/mol. At the same time, the authors write that "there is a considerable π-back-donation from the nitrogen lone-pair into the formally empty p(π) orbital at Si". [92] Since π-back-donation should lead to partial double bond character of the SiN bonds and, by this, also to an increase of the barrier for NH2 rotation, the arguments are contradictory, indicating that the true electronic nature of 5 is probably not correctly described.

Table 3 Selected computational data for Y3Si+ and Y3C+ cations. a

Cation            Sym    ΔE        δ29Si    p[pπ]   q(Si+/C+)
1, H3Si+          D3h    -         270.3    0.01    1.44
4, Me3Si+         C3h    36.6      355.9    0.14    2.00
5, (NH2)3Si+      D3     59.9      40.0     0.48    2.19
7, PhH2Si+        C2v    25.1 b    200.3    0.29    1.40
8, H3C+           D3h    -         373.2    0.01    0.41
9, Me3C+          C3h    72.5      369.4    0.40    0.59
10, (NH2)3C+      D3h    144.6     159.4    0.80    0.71
11, PhH2C+        C2v    74.9 b             0.45

(a) ΔE is the stabilization in kcal/mol by Y3 substitution compared to the parent silyl or methyl cation given by the reaction H3M+ + Y3MH → H4M + Y3M+ (where M is either Si or C) and calculated at the MP2/6-311G(d,p) level. The δ29Si values are calculated at the IGLO/[7s6p2d/5s4p1d/3s1p]//HF/6-31G(d) level. Natural populations p of the pπ atomic orbital of the central Si or C atom and NPA charges q are derived from the MP2/6-311G(d,p) density. Me = methyl, Ph = phenyl. (b) ΔE calculated at B3LYP/6-31G(d) level.
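The stabilization energies in Table 3 are defined through the isodesmic reaction H3M+ + Y3MH → H4M + Y3M+. The following is a minimal sketch of that bookkeeping, assuming total energies are available in hartree and that a positive ΔE denotes stabilization (reactants minus products), which matches the positive entries in Table 3. The function name and the example energies are hypothetical and do not come from the source.

```python
# Sketch only: stabilization energy from the isodesmic reaction
#   H3M+ + Y3MH -> H4M + Y3M+   (M = Si or C)
# Assumption: positive result = Y3M+ is stabilized relative to H3M+.

HARTREE_TO_KCAL_PER_MOL = 627.5095  # standard energy conversion factor


def stabilization_energy(e_parent_cation, e_subst_hydride,
                         e_parent_hydride, e_subst_cation):
    """Return the stabilization in kcal/mol; all inputs in hartree."""
    reactants = e_parent_cation + e_subst_hydride   # H3M+ + Y3MH
    products = e_parent_hydride + e_subst_cation    # H4M + Y3M+
    return (reactants - products) * HARTREE_TO_KCAL_PER_MOL


# Artificial energies chosen only to exercise the function (not real data).
example = stabilization_energy(-1.0, -2.0, -1.5, -1.6)
print(f"stabilization = {example:.1f} kcal/mol")
```

All four energies should come from the same level of theory; mixing methods or basis sets would spoil the error cancellation the isodesmic scheme relies on.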
[Scheme 4: conformations 5a (D3), 5b (C2v), and 5c (C3h) of (NH2)3Si+.]
We note that the barrier for NH2 rotation in 5 is an inappropriate measure of the SiN π-bond strength. When one NH2 group rotates, the other NH2 groups can compensate for the loss in π-conjugative stabilization and, thereby, reduce the rotational barrier. In the equilibrium geometry 5a (Scheme 4) the NH2 groups are slightly rotated out of the heavy atom plane because of steric repulsion. Upon rotation of one of the NH2 groups by 90° (form 5b, Scheme 4), the other NH2 groups can arrange in this plane, thus increasing their conjugative interactions with the Si atom. In addition, there are stabilizing hyperconjugative interactions between the occupied pseudo-π(NH2) orbital of the rotated NH2 group and the empty 3pπ(Si+) orbital in 5b as well as stabilizing anomeric interactions between the lp(N) orbital of the rotated NH2 group and the σ*(Si-N) orbitals. Hence, the barrier for rotation of an NH2 group is significantly reduced, and this reduction has little to do with the nature of the SiN bonds in 5a. To obtain a better estimate of the Si-N π-bond strength in 5a, the three NH2 groups should be rotated simultaneously into form 5c, which is 45.2 kcal/mol less stable than 5a (MP2/6-311G(d,p) calculations). [93] This suggests a value of 15.1 kcal/mol as a lower bound for the π-bond strength of a single SiN bond in 5a. The barrier for NH rotation in H2Si=NH was calculated to be 46.2 kcal/mol (MCSCF/6-31G(d) calculations [94]) and, therefore, it can be concluded that the SiN bonds in 5a possess at least 30% π character. Hence, the silaiminium resonance structures of Scheme 3 dominate the wavefunction of 5a; its aminomethylsilyl cation nature is modest and one can no longer speak of a true silylium cation. This is in line with the fact that the guanidinium ion (10, Table 3) possesses just little trivalent carbenium ion character.
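The arithmetic behind the π-bond estimate above can be checked in a few lines. This is only a sketch of the division described in the text: the two energies are the values quoted there, while the function names are illustrative.

```python
# Worked arithmetic for the Si-N pi-bond estimate in 5a (values from the text).

E_ALL_ROTATED = 45.2    # kcal/mol, 5c relative to 5a, MP2/6-311G(d,p)
N_SIN_BONDS = 3         # three equivalent Si-N bonds in (NH2)3Si+
E_REFERENCE_PI = 46.2   # kcal/mol, NH rotation barrier in H2Si=NH, MCSCF/6-31G(d)


def per_bond_pi_strength(total_energy, n_bonds):
    """Lower bound for one pi bond: total rotation energy shared equally."""
    return total_energy / n_bonds


def pi_character_percent(per_bond_energy, reference_barrier):
    """Per-bond value expressed as a percentage of a full Si=N pi bond."""
    return 100.0 * per_bond_energy / reference_barrier


per_bond = per_bond_pi_strength(E_ALL_ROTATED, N_SIN_BONDS)
percent = pi_character_percent(per_bond, E_REFERENCE_PI)
print(f"{per_bond:.1f} kcal/mol per Si-N bond, {percent:.0f}% pi character")
```

The result, about 15.1 kcal/mol and roughly one third of the H2Si=NH barrier, is consistent with the "at least 30%" statement.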
Olah, Prakash, and Sommer write that "the aminomethyl cation structures in acid salts of amines, amidines, and guanidines are small". [87a] Amino substitution completely changes the nature of the LUMO of a silyl cation, as can be seen from the MO drawings shown in Figure 4. For 5, the next-LUMO rather than the LUMO itself possesses the expected 3pπ(Si+) character. The energies of both LUMO and next-LUMO are relatively high compared to those of an alkylsilylium cation and, therefore, its electrophilicity will be considerably reduced. These observations are analogous to the corresponding carbenium ions: for H3C+ (8) and Me3C+ (9), the LUMO is dominated by a 2pπ(C+) orbital whereas in (NH2)3C+ (10) the next-LUMO rather than the LUMO possesses 2pπ(C+) orbital character. Since the LUMOs in amino substituted silyl and carbocations are rather diffuse and of relatively high energy, these ions possess reduced electrophilicity compared to normal silylium cations. To search for N-containing substituents that can interact with the 3pπ(Si+) AO even more strongly than NH2 may be chemically interesting, but is pointless with respect to the silylium ion debate. It is not the goal of the silylium ion debate to find conjugatively stabilized Y3Si+ ions since the latter no longer represent true silylium ions. Accordingly, the degree of coordination of these species in solution is also not central to the issue of silylium ions in solution. Instead, it has to be clarified whether Si analogues of alkyl or aryl substituted carbenium ions can be generated
in solution. Of course, hyperconjugative effects of an alkyl group also lead to partial population of the 3pπ(Si) orbital. However, these effects are smaller than in the corresponding carbenium ions (Table 3) and do not change the nature of the silylium cations. Hence, R3Si+ ions with R equal to alkyl or other saturated groups should be considered as true silylium ions.

Figure 4. LUMO of Y3Si+ ions with Y = H, CH3 and NH2. For the LUMO of Me3Si+, contributions from the pseudo-π orbitals of the methyl groups are not shown.

The situation is different with aryl groups, which also can stabilize silyl cations conjugatively and, by this, significantly reduce their silylium ion character (Table 3). However, if steric effects force an aryl substituent to rotate out of the local plane of the Si atom, then conjugation involving the 3pπ(Si) orbital will be diminished, the aryl group will act more like an alkyl substituent, and the silylium ion character of the cation in question will largely be retained. The silylium ion character is best assessed by the availability and population of the 3pπ(Si) orbital: A silylium ion is a cation for which 1) the LUMO is dominated by the 3pπ AO of Si and 2) the electron population of the 3pπ(Si) orbital is less than 30%. Clearly, 1) and 2) depend on each other and both define the electrophilicity of a silylium cation, which determines the chemical nature of silylium cations. Although 1) and 2) are rather clear, they are of limited practical use in so far as they can only be tested with the help of quantum chemical calculations, and even in this case the application of criteria 1) and 2) may be problematic. For example,
orbital energies can be related to ionization potentials and electron affinities via Koopmans' theorem; however, this relationship is more qualitative than quantitative in nature. In addition, calculated orbital energies change considerably with the method (HF, MCSCF or DFT) and the basis set used. The population of an AO is not a measurable property since AOs or orbitals in general are not observable quantities. Also (and for the same reason), the calculation of AO populations is not a unique procedure and, therefore, can be criticised in many ways. To calculate AO populations, one can use Mulliken or natural orbital populations, effective charges derived from the electrostatic potential of a molecule, or charges derived from a virial partitioning approach. Kraka has described advantages and disadvantages of different charge definitions, [95] but the present work will not delve further into this problem. One can argue that for the application of 1) and 2) only trends in orbital nature and orbital populations matter, so that any quantum chemical method (HF or DFT) or any way of calculating charges (Mulliken or natural orbital populations) suffices for this purpose as long as it is applied in a consistent way. One can further argue that the silylium cation character of a given silyl ion has only to be determined for the gas phase. Once this has been done, other molecular properties of the ion in question can be calculated and compared with the corresponding measured values for either gas or solution phase. In this way, a purely theoretical definition of the silylium ion character could be adjusted and extended to measurable quantities. If a silylium ion coordinates with solvent molecules S or counterions X-, negative charge will be transferred from a lone pair orbital or a bonding orbital to the empty 3pπ(Si+) orbital, which thereby becomes partially or completely filled. Significant changes in energy, geometry, and bonding occur.
In most cases, one has to speak of hypercoordinated siliconium ions since covalent bonds are established between Si and S (or X-). [79] Clearly, these cases are also not important in connection with the silylium ion debate. More interesting are those cases in which the interactions between R3Si+ and S (and/or X-) are just of the non-specific solvation type. Non-specific solvation should lead to a relatively small transfer of negative charge to the 3pπ(Si+) orbital. Accordingly, the properties of the silylium ion should not change very much relative to the corresponding gas phase properties. To check this, a number of molecular parameters of R3Si(S)n+ in solvent S can be investigated: a) the binding energy ΔE between R3Si+ and S or X-, b) the distance between Si+ and the nearest atom of solvent S or counterion X-, c) the electron population of the 3pπ orbital of Si+, d) the δ29Si NMR chemical shift value, and e) the electron density distribution between Si+ and the closest atom of the S (or X-) molecule. Apart from this, one could use the degree of pyramidalization at the central Si atom or geometrical distortions of solvent molecule(s) S or counterion(s) X- interacting with R3Si+ in solution. However, direct determination of geometry and stability of an ion R3Si(S)+ in solution by experimental means is difficult. In
special cases, it was possible to prepare crystals from R3Si(S)n+ X(S)m- compounds in solution and to determine their structures by X-ray diffraction analysis. However, in the solid state interactions between silylium cations and S or X- molecules are different from those in solution and, therefore, this approach provides only indirect information on the situation in solution. This is also true in the case of mass spectrometric measurements of binding energies ΔE in the gas phase. These data provide additional insight into the interactions between R3Si+ and molecules S, but again they do not directly relate to the situation in the solution phase. The only property in the list a) to e) that can relatively easily be measured for a silylium cation in solution is the 29Si NMR chemical shift. This quantity is sensitive to any changes in electronic structure, Si bonding and/or molecular geometry caused by the transition from gas to solution phase. By comparing the 29Si shift of an assumed silylium ion in solution with the corresponding gas phase value (calculated for example with the IGLO/HF or IGLO/DFT method), one can draw conclusions on the degree of silylium ion character retained in solution. Of course, for none of the properties covered by the list a) to e) is it guaranteed that changes in their values are linearly related to changes in the degree of silylium ion character. For example, ab initio investigations indicate that binding energies ΔE rapidly decrease with increasing interaction distances to values less than 5 kcal/mol, which are typical of van der Waals complexes, while the Si-S distances are still much smaller than the sum of the corresponding van der Waals radii. At the same time, δ29Si values suggest a strong reduction in the silylium ion character. Obviously, the shift value is very sensitive to small changes in the electronic structure of a silylium cation.
Depending on which of the parameters is used, one could consider one and the same silylium ion to be nearly free or weakly bonded. In view of these considerations and the fact that NMR chemical shifts can be both measured and calculated for silylium cations in solution, we focus on δ29Si shifts as suitable indicators for silylium cation character. This can be done by simply determining the shift difference Δ = δ29Si(R3Si+, gas phase) - δ29Si(R3Si(S)n+, solution phase) and using it as an indicator for the degree of interaction between silylium ions and solvent molecules S. The value of δ29Si(R3Si+, gas phase) is determined by quantum chemical calculations while δ29Si(R3Si(S)n+, solution phase) is a measured value. However, the latter value should also be calculated by an appropriate quantum chemical method since only in this way can the reliability of the gas phase value (and the method used) be verified. Alternatively, one can define the percentage silylium ion character according to Eq. (3)

\%[\mathrm{R_3Si^+}] = \frac{\delta^{29}\mathrm{Si}[\mathrm{R_3SiH}] - \delta^{29}\mathrm{Si}[\mathrm{R_3Si(S)^+}]}{\delta^{29}\mathrm{Si}[\mathrm{R_3SiH}] - \delta^{29}\mathrm{Si}[\mathrm{R_3Si^+}]} \times 100 \qquad (3)

where R3SiH is the parent silane. For both ways, practice has to show which changes in 29Si NMR chemical shifts indicate a free (nearly free) silylium cation in solution or a loss of silylium ion character. The approach just described can also be applied to carbenium ions R3C+; its usefulness can be tested, and at the same time a better understanding of the electronic nature of silylium cations can be provided by comparing them with the corresponding carbenium ions. For this purpose, we will briefly discuss in the next section the nature of carbenium ions in solution.
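The two indicators defined above, the shift difference and the percentage silylium ion character of Eq. (3), can be sketched as small helper functions. The function names are illustrative. In the example, 355.9 ppm (free Me3Si+) and 99.0 ppm (Me3Si(OH2)+) are calculated values quoted in this chapter, whereas the shift of the parent silane is a hypothetical placeholder, since no value for Me3SiH is given here.

```python
# Sketch of the two 29Si-based indicators; shifts in ppm relative to TMS.


def shift_difference(delta_free_gas, delta_solution):
    """delta29Si(R3Si+, gas phase) - delta29Si(R3Si(S)n+, solution phase)."""
    return delta_free_gas - delta_solution


def percent_silylium_character(delta_silane, delta_complex, delta_free):
    """Eq. (3): 100% for the free ion, 0% when the shift equals that of R3SiH."""
    denominator = delta_silane - delta_free
    if denominator == 0:
        raise ValueError("silane and free-ion shifts must differ")
    return 100.0 * (delta_silane - delta_complex) / denominator


DELTA_FREE = 355.9      # Me3Si+, calculated gas phase value (Table 3)
DELTA_COMPLEX = 99.0    # Me3Si(OH2)+, calculated value (Table 4)
DELTA_SILANE = -17.0    # hypothetical placeholder for Me3SiH, not from the source

print(shift_difference(DELTA_FREE, DELTA_COMPLEX))
print(percent_silylium_character(DELTA_SILANE, DELTA_COMPLEX, DELTA_FREE))
```

Because Eq. (3) is linear in the shift of the complex, it inherits the caveat stated above: there is no guarantee that the shift changes linearly with the degree of silylium ion character.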
4.2. Carbenium ions R3C + in solution
As a result of the pioneering work of Olah, alkyl substituted carbenium ions can easily be generated as relatively long-lived ions in superacid solution. [87] A superacid possesses a higher acidity than 100% sulphuric acid. [96] One of the most used superacids to generate carbenium ions is "magic acid", which is made by mixing the Lewis acid SbF5 and the Brønsted acid FSO3H. The counterion in this system, SbF6-, aggregates with one, two or even more SbF5 molecules to form anionic clusters like Sb2F11- and Sb3F16-, [97] in which the negative charge is completely delocalized. In connection with superacids, one normally uses solvents of low nucleophilicity such as SO2, SO2ClF and SO2F2. [87] The formation of carbocations in these environments can proceed according to several different routes as exemplified in reactions (4), (5), and (6).

RCH=CH2 --[HSO3F:SbF5, HF:SbF5]--> RC+HCH3          (4)

(CH3)3COH --[HSO3F:SbF5]--> (CH3)3C+                (5)

(CH3)3CF + 2 SbF5 --> (CH3)3C+ + Sb2F11-            (6)

Silylium ions cannot be formed under superacid conditions because of the high affinity of Si to O, F, and Cl atoms normally contained in these media. [78] All available experimental results [87] suggest that carbocations are just weakly coordinated to counterion or solvent. For trialkyl and triaryl substituted carbenium ions generated under superacid conditions, this has been demonstrated in a series of X-ray diffraction experiments carried out in particular by Laube and co-workers. [98-102] For example, for the 3,5,7-trimethyl-1-adamantyl cation and its counterion Sb2F11-, the closest C-F distance was measured to be 2.88 Å, suggesting that interactions between cation and anion are rather weak. [98] In an interesting study on the rotational reorientation of the 1-adamantyl cation in different superacid solutions, Kelly and Leslie confirmed that interactions between carbocation and the surrounding medium are also weak.
[103] An anisotropy of the rotational reorientation measured by NMR spectroscopy was found in SO2ClF solution, indicating external stabilization of C+, whereas in SO2 no such anisotropy was observed. The anisotropy in SO2ClF solution was explained by weak electrostatic interactions between C+ and a diffuse anionic charge cloud rather than specific cation-solvent interactions. [103] It was shown that the major difference between carbenium and silylium ions in solution arises from the fact that internal stabilization of R3C+ ions can be much larger than for R3Si+ ions. [6] For example, three methyl groups stabilize a carbenium ion by 72 kcal/mol (9, Table 3) because of hyperconjugative interactions. In this way, the LUMO is raised in energy and, accordingly, it is less available for a nucleophilic interaction partner. Hence, increasing internal stabilization of a carbenium ion makes specific external solvation more difficult. For carbenium ions with no or just one alkyl substituent attached to C+, internal stabilization is lowered and strong coordination of solvent molecules can take place. For example, when CH3F is dissolved in SbF5/SO2, the solvent is methylated so that instead of CH3+ the complex H3COSO+ is formed. [104,105] This occurs also for ethyl and iso-propyl fluoride, while use of SO2F2 instead of SO2 prevents dissociation of CH3F or other monoalkyl fluorides. [105] Obviously, there is a competition between solvent and anion to coordinate to the carbocation.
Figure 5. MP2/6-31G(d) geometries of Me3C+ (9) and Me3C(OH2)2+ (12). Bond lengths in Å and angles in degrees.

Experimentally observed carbenium ions normally have more than one alkyl or aryl substituent, which internally stabilize the ion and prevent strong interactions with solvent molecules. This is confirmed by a comparison of
measured 13C NMR chemical shifts of carbocations in solution and the corresponding calculated gas phase values. For example, the Me3C+ cation (9, Table 3) possesses a measured δ13C+ NMR shift value in SO2 solution of 335 ppm [87] while the calculated shift value for Me3C+ in the gas phase is 341 ppm. [106] Clearly, these values suggest that interactions between solvent (counterion) and cation are rather small. If one models the interactions between carbenium ion 9 and solvent molecules of considerable nucleophilicity such as water with the help of the complex Me3C(OH2)2+ (12, Cs-symmetry imposed, Figure 5), then the calculated geometry, binding energy, and NMR chemical shift values of 9 and 12 will provide a basis for comparing the effect of internal stabilization with external interactions of cation 9. There is almost no change in the calculated geometry caused by the interactions with the two water molecules. The C+,O distance is typical of a van der Waals distance, where one has to consider the fact that C+ should have a shorter van der Waals radius than C itself (1.85 Å). [107] By using the ratio between the covalent radii for C (0.72 Å) and C+ (0.69 Å) as a correction factor for the van der Waals radius of C, a van der Waals radius of 1.66 Å can be estimated for C+ and a van der Waals distance of 2.86 Å between C+ and O. [107] Since the calculated C+,O distance in 12 (2.776 Å, Figure 5) suggests just a 3% decrease of the estimated C+,O van der Waals distance, interactions between cation and the water molecules seem to be rather small. This is in line with the calculated IGLO 13C shift for C+ in 12, which is just 2 ppm smaller than that of the naked Me3C+ cation. [106] Obviously, solvent interactions do not play an important role for carbenium ions once R3C+ is hyperconjugatively stabilized by three alkyl groups.
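The radius scaling and the 3% contraction quoted above can be reproduced with a short calculation. This is a sketch only, and the function names are illustrative. One caveat: with the covalent radius of C as printed (0.72 Å) the scaling gives about 1.77 Å, whereas the quoted 1.66 Å is reproduced with 0.77 Å, a commonly tabulated single-bond covalent radius of carbon. The example therefore uses 0.77 Å as an explicit assumption.

```python
# Sketch of the van der Waals estimate for C+ and the C+,O contraction.
# Distances in angstrom.


def scaled_vdw_radius(vdw_neutral, cov_neutral, cov_cation):
    """Scale a van der Waals radius by the ratio of covalent radii."""
    return vdw_neutral * cov_cation / cov_neutral


def percent_contraction(reference_distance, actual_distance):
    """Shortening of a distance relative to a reference, in percent."""
    return 100.0 * (reference_distance - actual_distance) / reference_distance


VDW_C = 1.85            # van der Waals radius of C (from the text)
COV_C_PLUS = 0.69       # covalent radius of C+ (from the text)
COV_C_ASSUMED = 0.77    # assumption: tabulated value that reproduces 1.66

r_c_plus = scaled_vdw_radius(VDW_C, COV_C_ASSUMED, COV_C_PLUS)
contraction = percent_contraction(2.86, 2.776)  # estimated vs calculated C+,O

print(f"r_vdW(C+) = {r_c_plus:.2f}")
print(f"contraction = {contraction:.1f}%")
```

A contraction of roughly 3% relative to the estimated van der Waals contact is what supports the conclusion that the water molecules in 12 interact only weakly with the cation.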
5. SOLVATION OF NEUTRAL SILYL COMPOUNDS R3SiX AND SILYLIUM IONS R3Si+

When a silylium cation is generated from a neutral silyl compound in solution, then both reactant and product will be solvated to some extent, i.e. solvation of a silylium ion does not necessarily take place as a consequence of its generation from a neutral silyl compound in solution. Often solvation facilitates dissociation of a neutral silyl compound into silylium cation and counterion. In this case, part of the solvation shell may be carried along with the developing silylium ion in the form of a solvent complex R3Si(S)n+ that possesses completely different properties than a silylium ion. One has to check in such a case whether it is still justified to speak of dissociation into silylium ion and counterion. To clarify whether solvation takes place before, during or after dissociation, the solvation of neutral as well as cationic silyl compounds was investigated by Olsson, Ottosson and Cremer. [41]
5.1. Neutral Si-compounds in solution

Figure 6. Selected geometrical parameters and NMR chemical shifts of neutral R3SiX compounds and neutral pentacoordinated complexes (S)R3SiX calculated at the HF/6-31G(d) and IGLO-HF/[7s6p2d/5s4p1d/3s1p]//HF/6-31G(d) level of theory. Bond lengths in Å, angles in deg, NMR chemical shifts relative to TMS. Values in parentheses refer to experimental results. [41]
[Structure drawings omitted. Labelled values: 13, C3v: 1.888 (1.867), δ29Si = -66.5 (-65.2) ppm; 14, C3v: 2.068 (2.048), δ29Si = -39.3 (-36.1) ppm; 15, C1: 1.737, δ29Si = -27.7 ppm, α(H-Si-O) = 105.9; 16, C1: 1.770, δ29Si = 43.6 (43.4) ppm, α(C-Si-O) = 105.1; 17, C3: 2.780 and 2.104, δ29Si = -58.5 ppm, α(H-Si-Cl) = 104.2; 18, C1: 1.822 and 2.317, δ29Si = -93.8 ppm, α(H-Si-O) = 97.0.]

In Figure 6, some calculated results for neutral silyl compounds R3SiX and (S)R3SiX 13 - 18 are shown. [41] Silylperchlorates are of particular interest since Lambert and co-workers investigated in their early studies a possible dissociation
of these compounds at low concentrations in solvents such as sulfolane. [80-84] It was found by ab initio calculations that for silylperchlorates the Si-O bonds are longer than normal Si-O bonds by about 0.1 Å. [41] Because of the high group electronegativity of OClO3 compared to that of groups like Cl, OH, and CN, [108] the perchlorate group easily accepts the negative charge that becomes available upon heterolytic Si-O dissociation. The long Si-O bonds indicate that R3SiOClO3 is prone to dissociate and, therefore, perchlorates should be suitable precursors for generating silylium ions in solution. The long Si-O bonds and the high group electronegativity of OClO3 should also imply that a considerable positive charge resides at the Si atom in the neutral compound. It is likely that a nucleophilic solvent molecule S coordinates to Si in R3SiX so that a neutral pentacoordinated complex (S)R3SiX is formed. Complexes 17 and 18 of Figure 6 represent appropriate models for this situation. A neutral (S)R3SiX molecule can dissociate into a tetracoordinated R3Si(S)+ complex and anion X- provided the basicity of S is larger than that of X-. According to calculations carried out by Olsson, Ottosson, and Cremer, [41] silyl perchlorates form interaction complexes with nucleophilic partners such as NH3, where this molecule is used to model a nucleophilic solvent molecule S. As shown in Figure 6, S preferentially coordinates at the Si atom opposite to the potential leaving group X. The better the leaving group ability of X, the stronger the coordination between Si and S, as can be seen when comparing 17 and 18 in Figure 6. The 29Si NMR chemical shifts of the (S)R3SiX complexes are more upfield by 20 - 50 ppm than in R3SiX and, accordingly, it should be possible to distinguish (S)R3SiX complexes from R3SiX compounds by NMR spectroscopy.
Analysis of calculated Mulliken charges reveals that the negative charge donated by the molecule S to the silyl molecule R3SiX in complex (S)R3SiX does not stay at the Si atom. Instead, it is passed on to X, which gets enough negative charge to dissociate as anion X-. Hence, upfield shifted δ29Si values in solvation complexes (S)R3SiX compared to R3SiX primarily result from shielding of the Si atom due to a transfer of negative charge from S to the silyl molecule. In addition, changes in geometry caused by S association may lead to an upfield shift of the 29Si NMR signal. One can conclude that the formation of complexes (S)R3SiX is a prerequisite for dissociation of R3SiX in nucleophilic solvents S. In such media, free silylium ions are not formed since dissociation leads to cationic complexes R3Si(S)+, which have little or no silylium ion character. [41]
5.2. Specific complexation of R3Si+ by nucleophilic solvent molecules
For a better understanding of the nature of solvated silylium ions in nucleophilic solvents the properties of R3Si(S)+ complexes were investigated for solvent prototypes S such as H2O, NH3 and HCl. [41] The possible formation of both tetra- and pentacoordinated complexes was considered. The two S molecules in pentacoordinated complexes were held at equal distances from the Si atom in order to describe a situation in which the S molecules are part of a solvent shell
and, therefore, interact similarly. A short summary of the most important data is given in Table 4. [41] The donicity of the nucleophilic solvent molecule S and the electrophilicity of R3Si+ determine the strength of the coordination complex. For H3Si+ (1), the calculated coordination energies are in the range 20 - 80 kcal/mol for tetracoordination and 30 - 110 kcal/mol for pentacoordination. Because of the reduced electrophilicity of Me3Si+ (4) as a result of hyperconjugative interactions between Si+ and three methyl groups, solvent coordination of 4 is much weaker as reflected by coordination energies of 10 - 70 kcal/mol. The covalent nature of Si-S interactions of the complexes listed in Table 4 [41] is suggested by the bond electron density analysis carried out according to criteria given by Cremer and Kraka. [109]

Table 4
Some calculated values for R3Si(S)n+ complexes (n = 1 or 2). a

Complex               Sym    R(Si-S)     ΔE      δ29Si    Charge transfer
19, H3Si(OH2)+        Cs     1.859      57.7      13.4    0.251
20, H3Si(OH2)2+       C2v    2.027      83.3     -69.3    0.300
21, H3Si(NH3)+        C3v    1.917      77.8     -28.7    0.368
22, H3Si(NH3)2+       D3h    2.073     109.1    -127.7    0.457
23, H3Si(ClH)+        Cs     2.339      22.6      26.7    0.237
24, H3Si(ClH)2+       C2v    2.616      29.7     -13.1    0.253
25, Me3Si(OH2)+       Cs     1.910      40.6      99.0    0.201
26, Me3Si(OH2)2+      Cs     2.176      52.5      58.2    0.203
27, Me3Si(NH3)+       C3v    1.957      56.6      52.8    0.317
28, Me3Si(NH3)2+      C3h    2.149      72.0     -43.6    0.336
29, Me3Si(ClH)+       Cs     2.545       9.3     183.5    0.171

a From Ref. 41. Values calculated at the HF/6-31G(d) and IGLO/[7s6p2d/5s4p1d/3s1p]//HF/6-31G(d) levels. Bond lengths in Å, coordination energies ΔE in kcal/mol, 29Si shifts in ppm relative to TMS, and transfer of positive charge from R3Si+ to S in electrons derived from the Mulliken charge distribution.

IGLO 29Si NMR shifts indicate that most of the complexes listed in Table 4 have only little silylium ion character. The shifts are in the range -130 to 30 ppm for complexes formed with ion 1, and in the range -45 to 180 ppm for complexes formed with ion 4. Hence, they are far from the ideal values of 321 ppm (1, Table 1) and 381 ppm (4, Table 1) expected for free silylium ions. The complex formed with HCl seems to retain some residual silylium ion character (53%) according to formula (3) of Section 4.1, although it is no longer justified to speak in this case of a free silylium ion in solution.
Within each series of complexes, the δ29Si value correlates with the transfer of positive charge from the silylium ion to the S molecule(s). Since the empty 3pπ(Si) AO in 1 is much more prone to accept negative charge, it accepts more negative charge from a nucleophilic S molecule (see Table 4), its Si nucleus is more shielded, and, as a consequence, its 29Si NMR signal is shifted further upfield than in the corresponding S complex of 4 (Table 4). When comparing with the corresponding carbenium ions, it becomes obvious that S donates considerably more charge to the silylium than to the carbenium ions. For example, charge transfer is roughly four times as large in Me3Si(OH2)2+ (26, Table 4) as in Me3C(OH2)2+ (12, Figure 5). Even for the weakly coordinating HCl, which models chlorinated hydrocarbon solvents such as CH2Cl2, charge transfer to a silylium cation such as 4 is still considerable (29, Table 4) while it is negligible for the corresponding carbenium ion. [41] By a stepwise build-up of the solvent shell around cation 1 it was found that the first solvation shell is composed of maximally 10 - 12 solvent molecules. [41] In the relatively strongly nucleophilic solvent H2O, the coordination energy converges to a value of -100 kcal/mol and the δ29Si value toward -45 ppm. [41] However, when the weaker solvent prototype HCl was used, a different situation occurred. Keeping HCl molecules at van der Waals distances from the Si atom, thus simulating what occurs in the limit of very weakly coordinating solvents, the coordination energy asymptotically reached a value of 30 kcal/mol and the 29Si NMR chemical shift converged toward 170 ppm. [41] Even though the coordination energy of complexes H3Si(ClH)n+ is low, the 29Si NMR chemical shift clearly indicates considerable changes in the silylium ion character. Hence, there will be little chance of generating silylium cations in solution, even if weakly donating solvents are used.
This goal can only be realized if the silylium cation is protected in some way against attack by either solvent molecules or counterions. Also, solvents and counterions with very weak nucleophilic character have to be designed for the study of free silylium cations in solution. In the next section, we will briefly discuss how the latter prerequisite can be fulfilled.
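The correlation between δ29Si and charge transfer noted in this section can be checked directly against the Table 4 entries. The following is a minimal sketch, not part of the original analysis: it computes a plain Pearson coefficient separately for the complexes of H3Si+ (1) and of Me3Si+ (4). The function name `pearson` and the dictionary names are illustrative choices; the numbers are the Table 4 values.

```python
from math import sqrt

# (charge transfer in e, delta 29Si in ppm) as listed in Table 4.
H3SI_SERIES = {
    "19 H3Si(OH2)+": (0.251, 13.4),
    "20 H3Si(OH2)2+": (0.300, -69.3),
    "21 H3Si(NH3)+": (0.368, -28.7),
    "22 H3Si(NH3)2+": (0.457, -127.7),
    "23 H3Si(ClH)+": (0.237, 26.7),
    "24 H3Si(ClH)2+": (0.253, -13.1),
}
ME3SI_SERIES = {
    "25 Me3Si(OH2)+": (0.201, 99.0),
    "26 Me3Si(OH2)2+": (0.203, 58.2),
    "27 Me3Si(NH3)+": (0.317, 52.8),
    "28 Me3Si(NH3)2+": (0.336, -43.6),
    "29 Me3Si(ClH)+": (0.171, 183.5),
}


def pearson(pairs):
    """Pearson correlation coefficient for a list of (x, y) pairs."""
    n = len(pairs)
    mean_x = sum(x for x, _ in pairs) / n
    mean_y = sum(y for _, y in pairs) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in pairs)
    sxx = sum((x - mean_x) ** 2 for x, _ in pairs)
    syy = sum((y - mean_y) ** 2 for _, y in pairs)
    return sxy / sqrt(sxx * syy)


r_h3si = pearson(list(H3SI_SERIES.values()))
r_me3si = pearson(list(ME3SI_SERIES.values()))
print(f"H3Si+ series:  r = {r_h3si:.2f}")
print(f"Me3Si+ series: r = {r_me3si:.2f}")
```

A negative coefficient in both series corresponds to the trend described in the text: the more positive charge is transferred to S, the further upfield the 29Si signal lies. With only five or six points per series, the coefficient is an illustration of the trend rather than a statistical result.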
5.3. Counterions used in research on R3Si+ and R3C+ ions in solution
When searching for free silylium or carbenium ions in solution, counterions have to be used that interact only weakly with the cation. For this reason, counterions have been designed that do not coordinate to cations [110] since their negative charge is largely delocalized and/or their center of charge is protected by bulky substituents. Some of the counterions used or to be used in the research on silylium ions in solution are shown in Scheme 5. Anion Sb2F11- (30), which is often used in carbenium ion investigations, is an example of a counterion with delocalized negative charge. [87,97] However, 30 cannot be used in silylium ion chemistry since trialkylsilyl halides do not dissociate into R3Si+X- ion pairs in superacid media. [18] Corey proposed the perchlorate anion (31) as a suitable counterion [85] and Lambert used it in many of his investigations. [80-84] However, Olah and co-workers showed that the Si-O bond in the X-ray structure of triphenylsilyl perchlorate is 1.744 Å, which suggests covalent bonding. [19] In addition, Olsson, Ottosson, and Cremer revealed that the measured δ29Si value of trimethylsilyl perchlorate in various solvents (27 - 47 ppm) resembles the corresponding value calculated with the IGLO method for the gas phase (44 ppm). [41] An analysis of the electron density of the Si-O bond of this molecule suggested that the bond is strongly polar but still covalent in character. [41]

Scheme 5. Structures of counterions 30 - 36.

More recently, anions such as 32 - 36 were designed, synthesized, and some of them used in silylium cation chemistry. [7-16,111,114,115] They have the advantage that their negative charge is both blocked and strongly delocalized and, accordingly, 32 - 36 are typical examples of weakly coordinating anions. When investigating solutions of trialkylsilyl cations and carborane anions in aromatic
solvents, Reed and co-workers observed coordination of R3Si+ to the counterion rather than the solvent. [11-16] Varying the substituent X in X6CB11H6- anions (X = Cl, Br or I), it was found that the hexachlorinated anion 35 has the weakest interactions with the silylium cation. [16] In the crystal structure of the tri(isopropyl)silyl cation in the presence of 35, the Si,Cl interaction distance is just 2.323 Å [16], which suggests covalent Si-Cl bonding and formation of a halonium ion rather than a silylium ion. Also, a δ29Si NMR chemical shift of 115 ppm was measured for the system tri(isopropyl)silyl cation/anion 35, which is 5 and 18 ppm downfield of the values for the hexabromo- (34) and hexaiodocarborane analogues [16], but nevertheless too far upfield for a silylium cation. Schleyer and co-workers modelled the situation of a silylium cation interacting with a substituted carborane such as 35 in ab initio calculations and also came to the conclusion that the interactions approach those of a bonding rather than a van der Waals situation. [6] Lambert and co-workers chose to work with tetrakis(pentafluorophenyl)borate (33, TPFPB-). [7-10] A crystal structure analysis of the triethylsilyl cation in toluene solution with TPFPB- as counterion revealed that the closest cation-anion contact distance is 4.18 Å, which is well outside the range of van der Waals distances, suggesting that counterion coordination to R3Si+ is no longer a problem with anions such as 33. [7] When designing counterions that are even less nucleophilic than those previously utilized by Lambert and Reed, one should avoid atoms with lone pairs that are able to coordinate to R3Si+. Therefore, use of counterions that contain O, F, or Cl is not advisable. Accordingly, a carborane anion substituted by alkyl groups is better suited than its halogenated analogues.
In this regard, it should be noted that Michl and co-workers recently managed to synthesize the dodecamethylcarba-closo-dodecaborate(-) anion (36), [111] which is rather stable and can be dissolved in nonpolar solvents such as carbon tetrachloride and toluene. Therefore, it should be an ideal counterion in the search for persistent silylium ions in condensed phases.
6. STRUCTURE DETERMINATION OF SILYL CATIONS IN SOLUTION
As pointed out in Section 2, the NMR/ab initio/IGLO method represents a useful approach to get information about what occurs when silyl compounds R2HSiX and R3SiX are dissolved in various solvents. This is interesting not only for the silylium ion debate, but also because R2HSiX and R3SiX are commonly used reagents in organic synthesis. [112] An understanding of what occurs with these compounds in solution is essential to chemistry in general and, therefore, several experimental investigations on solvated silyl compounds were carried out in the past. [113-116] However, none of these studies provided detailed structural information. The NMR/ab initio/IGLO method can provide this information provided reliable NMR chemical shifts are available from experiment. In the following, we describe a joint enterprise between NMR spectroscopists and quantum chemists to apply this method in the case of silyl cations in nucleophilic solvents. [43]
6.1. Structure determination by the NMR/ab initio/IGLO method
Arshadi and co-workers [43] investigated sixty different R3SiX/S and R2HSiX/S systems where solvents S of both relatively weak donicity (methylene chloride, sulfolane, acetonitrile, etc.) and relatively strong donicity (dimethyl sulfoxide (DMSO), N-methylimidazole (NMI), etc.) were considered. Specific solvation and dissociation of R3SiX (or R2HSiX) was assumed to take place in three steps involving equilibria between tetra- or pentacoordinated Si compounds I - IV as shown in Scheme 6.

R3SiX (I) + n S  ⇌  (S)R3SiX (II) + (n-1) S  ⇌  R3Si(S)+ (III) + X- + (n-1) S  ⇌  R3Si(S)2+ (IV) + X- + (n-2) S

Scheme 6

Comparison of experimental and theoretical NMR chemical shifts suggested the existence of complexes III and/or IV in solution. [43] It was found that formation of these complexes in solution is influenced by three major factors, namely 1) the steric bulk of solvent S as well as that of substituents R attached to the positive Si atom, 2) hyperconjugative and conjugative stabilization of Si+ by substituents R, and 3) the ability (donicity) of the solvent S to coordinate to R3Si+. [43] A satisfactory agreement between experimental and theoretical 29Si shifts could only be obtained if calculations were performed with the PISA solvent model. Geometries of complexes III (or IV) were determined by considering the Si-S interaction distance as the leading parameter. In the PISA calculations, this distance had to be reduced relative to the corresponding gas phase value to obtain agreement between calculated and measured NMR chemical shift values. When this agreement was reached, the calculated geometry was considered to present a
reasonable description of the geometry of the silyl cation-solvent complex under the conditions of the NMR measurements. A listing of important properties of tetracoordinated complexes 37 - 40 formed between Me3Si+ and one solvent molecule S as calculated with the NMR/ab initio/IGLO method is given in Table 5. Interaction distances Si-S are shorter by 0.01 - 0.04 Å in solution than in the gas phase. This is due to the polarisation of both S and R3Si+ within the solvent cage, indicated by a larger charge transfer in the solvated complex (Table 5), which is comparable with what was found for the BH3NH3 donor-acceptor complex. [38]

Table 5
Some properties of Me3Si(S)+ complexes 37 - 40 in gas and solution phases. [43] a

Me3Si(S)+       37             38            39             40
Solvent S       NCCH3          sulfolane     pyridine       DMSO
Sym             C3v            C1            Cs             Cs
r[Si-S; s]      1.862          1.794         1.894          1.747
Δr[Si-S] b      0.040          0.025         0.017          0.026
ΔE(s) c         39.5           41.4          52.6           58.9
ΔE(g) c         50.5           59.2          72.2           75.5
δ29Si(s)        35.7           58.3          (45.1) d       45.0
δ29Si(g)        52.2           65.2          48.3           53.4
δ29Si(exp)      28.6 - 38.5    (58.4) e      40.8 - 42.6    42.7 - 42.8
Δq(s) f         0.320          0.303         0.346          0.354
Δq(g)           0.270          0.283         0.313          0.325

a The index s refers to the solution phase geometry calculated at the PISA-HF/6-31G(d) level and NMR shifts calculated at the PISA-IGLO/[7s6p2d/5s4p1d/3s1p]//PISA-HF/6-31G(d) level, whereas the index g refers to the gas phase geometry at HF/6-31G(d) and NMR shifts at IGLO/[7s6p2d/5s4p1d/3s1p]//HF/6-31G(d). Bond distances are given in Å, energies in kcal/mol, 29Si NMR chemical shifts in ppm relative to TMS, and charge transfer in electrons.
b Δr[Si-S] = r[Si-S; g] - r[Si-S; s].
c Coordination energy calculated from the reaction Me3Si(S)+ → Me3Si+ + S.
d 29Si NMR shifts calculated at IGLO/[7s6p2d/5s4p1d/3s1p]//PISA-HF/6-31G(d).
e Experimental 29Si shift from Et3Si(sulfolane)+.
f Δq gives the transfer of positive charge from the Me3Si group to S in the Me3Si(S)+ complex calculated from the Mulliken charge distribution.

It is noteworthy that the distances between Si+ and the coordinating atom of S are merely 5 - 10% longer than covalent bond distances in neutral R3SiX compounds. [117] Accordingly, the silylium ion character of these complexes should be low. This was confirmed by an analysis of the bond electron density, which reveals that covalent Si-S bonds are formed in all complexes. [43] The 29Si NMR shifts are
upfield by approximately 300 ppm, and the percentage silylium ion character as calculated with the help of equation (3) is merely 14 - 20%. The complex binding energies (40 - 60 kcal/mol) are smaller by 10 - 20 kcal/mol in solution than in the gas phase. This can be understood when considering that positive charge is more localized in Me3Si+ than in the complex Me3Si(S)+. Accordingly, the (nonspecific) solvation energy is larger in the former than the latter ion, which reduces the binding energy of the solvent-ion complex formed in the specific solvation process. From the data in Table 5, one can conclude that an increase in binding energy parallels an increase in solvent donicity provided that steric factors do not hinder the contact between Si+ and S. For example, the latter effect was found in the case of pyridine as solvent. In pyridine, the complexation energy is lower than in the slightly weaker donor DMSO, which is due to large steric interactions between substituents R and the α-H atoms of pyridine. [43] The investigation of Arshadi and co-workers [43] suggests that formation of a pentacoordinated complex of type IV does not occur in the case of the Me3Si+ cation, which is in line with similar observations by Kira and co-workers made for the complex Me3Si(OEt2)+. [114a] Most likely this is due to steric factors, which also play a role in the pentacoordination of R2HSi+ cations. For example, pentacoordinated complexes could not be detected when dissolving Et2HSiOClO3 in the strongly donating hexamethylphosphoric triamide (HMPA). [43] However, in the less bulky but much weaker nucleophilic solvent acetonitrile, pentacoordination was detected. Hence, steric effects seem to be important in connection with the formation of type IV complexes. According to PISA-HF calculations, complexes Me2HSi(S)2+ are not much more stable than the analogous tetracoordinated Me2HSi(S)+ complexes.
If internal hyperconjugative stabilization of the silylium ion is lowered as in cations MeH2Si+, then the need for external stabilization by S complexation will be increased and pentacoordination will occur easily. In several cases, crystallographic data are available that confirm the existence of tetra- or even pentacoordinated silyl cationic complexes. [11,118-120] Changes in the geometry of complexes R3Si(S)+ due to an increase of the dielectric constant of the solvent S are most important in the interval ε = 0 - 10 as shown in Figure 7 for Me3Si(NCCH3)+ (37). Accordingly, changes in the gas phase geometry can be anticipated even if solvation occurs in solvents with rather low polarity (e.g. CH2Cl2). Geometrical changes should be particularly large when the Si-S bond as the leading parameter of the complex is "soft", i.e. when energy changes upon a change in this parameter are small. In summary, trialkylsilylium ions R3Si+ are stabilized by forming tetracoordinated Si complexes of type III. Pentacoordination at Si will only occur if there are just weak steric interactions between substituents R and solvent S. Mostly, this goes along with a low internal stability of the silylium ion in question. Accordingly, R2HSi(S)2+ complexes can be observed experimentally while this is not possible for trialkylsilyl cations. [43]
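The statement that binding energies are 10 - 20 kcal/mol smaller in solution than in the gas phase can be reproduced from the Table 5 entries. The sketch below is only a bookkeeping aid under the assumption that the ΔE(g) and ΔE(s) values of Table 5 are compared complex by complex; the names `TABLE5` and `penalties` are illustrative and do not come from Ref. 43.

```python
# Coordination energies from Table 5 in kcal/mol:
# dE_g = gas phase, dE_s = solution phase (PISA model).
TABLE5 = {
    "37": {"solvent": "NCCH3", "dE_s": 39.5, "dE_g": 50.5},
    "38": {"solvent": "sulfolane", "dE_s": 41.4, "dE_g": 59.2},
    "39": {"solvent": "pyridine", "dE_s": 52.6, "dE_g": 72.2},
    "40": {"solvent": "DMSO", "dE_s": 58.9, "dE_g": 75.5},
}

# Reduction of the binding energy on going from the gas phase to solution.
penalties = {
    label: round(row["dE_g"] - row["dE_s"], 1) for label, row in TABLE5.items()
}

for label, row in TABLE5.items():
    print(f"{label} ({row['solvent']}): "
          f"dE(g) - dE(s) = {penalties[label]:.1f} kcal/mol")
```

The four differences fall inside the 10 - 20 kcal/mol window quoted in the text. The same table also shows the pyridine complex 39 binding more weakly than the DMSO complex 40 in both phases.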
Figure 7. The Si-N bond length (in Å) in Me3Si(NCCH3)+ (37) as a function of the dielectric constant ε of the solvent. Calculated at the PISA-HF/6-31G(d) level. [47]
7. INTRAMOLECULAR SOLVATION OF SILYLIUM IONS
A solvated molecule is rather difficult to describe since there can be specific interactions with one or more solvent molecules apart from nonspecific interactions with one or more solvent shells. Although the investigation of specific solvent complexes provides a description of major solvent effects, it is far from giving an exact picture of a solvated molecule. This can only be obtained by extensive solvent modelling using Monte Carlo methods, which becomes too costly when carried out for a larger number of silylium cations in different solvents. Alternatively, one can create a situation in which primary solvation of the solute molecule occurs in a well-defined way. For this purpose, the idea of "internal solvation" was developed, [44] which is based on the following: If the solute and solvent molecules are part of the same molecular system, then the molecular link between them will exactly define the magnitude and type of solute-solvent interactions. The idea of internal solvation can be realized in the case of silylium ions with the help of a template to which both the group R2Si+ and a nucleophilic "solvent" -CH2-Z are bonded. For example, the bidentate molecular templates used by Corriu and co-workers to study intramolecular coordination [121-124] are suitable to investigate a well-defined internal solvation situation. The general structure of these templates, originally developed by van Koten and co-workers, [125,126] is given by 41. Corriu and co-workers [121-124] investigated cations 41 for which Z is chosen to be an amino group. This leads to relatively strong Si,Z interactions and a tetra- or pentacoordinated Si atom. By varying Z, different solvent situations can be modelled, which makes it possible to predict a situation where silylium ion character is largely retained.
Structure 41. General template carrying the R2Si+ group and two -CH2-Z substituents.
7.1. Strong intramolecular solvation of silyl cations
To maximize stabilizing interactions between the empty 3pπ(Si) orbital and the electron lone pairs at the N atoms of the two substituents -CH2-NR2, the R2Si group in 41 adopts a conformation in which it is perpendicular to the plane of the phenyl ring. In this form, conjugation between the 3pπ(Si+) orbital and the π-orbitals of the phenyl ring is lost, which means a considerable loss of energy. For the model systems PhH2Si+ (7) and PhMe2Si+ (42), rotation by 90° requires 15 - 17 and about 11 kcal/mol, respectively, as revealed by the energies listed in Table 6. The loss of conjugation is clearly reflected by the geometries of the orthogonal forms 7b and 42b shown in Figure 8. The SiC bonds are elongated and the CC bonds in the phenyl ring are equalized compared to the equilibrium structures 7a and 42a.

Figure 8. HF/6-31G(d) (normal print) and B3LYP/6-31G(d) (italics) geometries of PhH2Si+ (7) and PhMe2Si+ (42) in their planar (a) and perpendicular (b) forms. Bond lengths in Å and angles in degree.

Table 6
Calculated properties of silyl cations with and without internal coordination. a

Molecule                          Sym    Rel. energy (HF)   Rel. energy (B3LYP)   δ29Si      q(Si+)
7a, PhH2Si+                       C2v    0.0                0.0                   200.3      0.826
7b, PhH2Si+                       C2v    15.1               17.3                  265.6      0.945
42a, PhMe2Si+                     C2     0.0                0.0                   268.0      1.050
42b, PhMe2Si+                     C2     10.6               11.0                  320.6      1.098
1, H3Si+                          D3h    -                  -                     270.2      0.932
3, Me2HSi+                        C2v    -                  -                     334.1      1.076
43a, H2SiPh(CH2NH2)2+             C2     0.0                0.0                   -80.5 b    0.948
43b, H2SiPh(CH2NH2)2+             C2     86.5               85.1                  -          0.781
44, H2SiPh(CH2NMe2)2+             C2     -                  -                     -43.1 c    0.989
45b, Me2SiPh(CH2NMe2)2+           C2     0.0                0.0                   -          1.301
45c, Me2SiPh(CH2NMe2)2+           C1     7.5                10.4 (6.9 d)          -          1.173
46a, H2SiPh(t-Bu)2+               C2     0.0                0.0                   25.1       0.855
46b, H2SiPh(t-Bu)2+               C2v    18.0               19.1                  -          0.839
47, Me2SiPh(t-Bu)2+               C2     -                  -                     146.2      0.872
48, H2SiAnt(Me)2+                 C2     -                  -                     63.9       1.105
49, Me2SiAnt(Me)2+                C2     -                  -                     186.7      1.083

a Energies in kcal/mol from calculations with the 6-31G(d) basis. NMR chemical shifts in ppm relative to TMS, charges in e. Ph: phenyl, Ant: anthryl. δ29Si values calculated at the HF-IGLO/[7s6p2d/5s4p1d/3s1p]//HF/6-31G(d) level, and q(Si+) Mulliken charges at the HF/6-31G(d) level.
b δ29Si value calculated at the IGLO/6-31G(d)//HF/6-31G(d) level.
c Experimental 29Si shift value is -46.4 ppm.
d Energy difference calculated at the SCIPCM-B3LYP/6-31G(d) level for the dielectric constant ε = 32.4.

It is interesting to note that the calculated IGLO 29Si shift values of the R2PhSi+ cations with phenyl rings perpendicular to the R2Si group are similar to those of the parent R2HSi+ cations (HF-IGLO values of 1: 270 ppm; 3: 334 ppm, Table 6; DFT-IGLO values (321 and 365 ppm, Table 1) are more reliable, however, these are not available for the other molecules listed in Table 6; in any case trends will be reproduced consistently by HF-IGLO). This similarity is due to the fact that the conjugative effect of the phenyl ring is eliminated since phenyl and R2Si are orthogonal to each other. Accordingly, there are just hyperconjugative or inductive interactions between silyl group and phenyl ring and it seems as if the phenyl ring of 7b and 42b, respectively, has a similar effect on Si+ as an H atom. In line with this is the fact that the Mulliken charges at Si (see Table 6) are comparable within 0.01 - 0.02 e for 7b and 1 (42b and 3). The barrier for rotation of the R2Si+ unit in silyl cations 41 should be similar to that in 7 and 42. However, in those systems studied by Corriu and co-workers, [121-124] destabilization due to rotation is compensated by stabilizing intramolecular interactions between Si+ and the NR2 ligands. This can be confirmed when comparing the energies of molecular forms with and without internal coordination at Si. For instance, the energy of 43b (Z = NH2 and R2Si+ = H2Si+, Scheme 7), for which the bidentate ligands are rotated away from Si+, is 86.5 kcal/mol higher than that of the equilibrium form 43a (B3LYP calculations [44]). Ottosson and Cremer concluded that coordination of the NH2 ligands to Si+ in 43a stabilizes the cation by approximately 100 kcal/mol and that the bidentate ligands Z can compete with or even prevent external complexation of R2Si+ by solvent molecules S. [44]
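One way to see how a stabilization of roughly 100 kcal/mol can arise from the numbers quoted above is to add the cost of rotating the silyl group out of conjugation to the energy difference between 43b and 43a. The following sketch rests on an assumption that is not stated explicitly in the text: that the 86.5 kcal/mol difference already contains the rotation penalty, which is approximated here by the 15.1 - 17.3 kcal/mol found for the model cation 7. The variable names are illustrative.

```python
# Energies in kcal/mol taken from the text and Table 6.
E_43B_MINUS_43A = 86.5          # uncoordinated form 43b relative to 43a
ROTATION_COST_7 = (15.1, 17.3)  # 7b relative to 7a, lower and upper value

# Assumption (hypothetical bookkeeping): coordination must pay for the
# lost conjugation and still leave 43a lower in energy than 43b.
coordination_estimate = tuple(
    round(E_43B_MINUS_43A + cost, 1) for cost in ROTATION_COST_7
)

low, high = coordination_estimate
print(f"Estimated Si-N coordination stabilization: {low} - {high} kcal/mol")
```

Under this reading the estimate lands slightly above 100 kcal/mol, consistent in magnitude with the value given by Ottosson and Cremer; the actual derivation in Ref. 44 may differ.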
Scheme 7. Structures of 43a (C2) and 43b (C2).

Silyl cationic species of type 41 with Z = NR2 resemble silyl cations in N-containing solvents according to similarities of calculated properties for 41 and complexes R3Si(NR'3)n+. [40,44] Available results from crystallographic studies of silylium cations coordinated to solvents such as N-methylimidazole or pyridine also confirm this observation. [118-120] In the case of cation 41 with Z = NR2 it is justified to speak of an internally solvated silylium ion closely related to a solvated ion R3Si+ in amine solution. The conjugative effect of the phenyl ring is eliminated in the orthogonal geometry so that the phenyl (aryl) substituent resembles an alkyl substituent, thus making the comparison of cations 41 and alkyl substituted silyl cations in solution meaningful. When R2Si = H2Si and Z = NMe2, cation 44 shown in Figure 9 is obtained. For this and similar species, Corriu and co-workers concluded from conductivity
measurements that ionic structures are formed in solution when the parent silanes are treated with half an equivalent of iodide. [121] The HF-IGLO 29Si shift of 44 is -44.2 ppm, [44] which is in reasonable agreement with the experimental value of -46.4 ppm. [121] This shift value indicates strong coordination since it is upfield by more than 300 ppm relative to that of 7b (Table 6). According to a bond electron density analysis (Table 7), the Si atom in 44 is pentacoordinated, possessing two covalent Si-N bonds, [44] and the bond lengths resemble those observed in crystallographic studies on pentacoordinated silyl cationic species (2.01 - 2.08 Å). [118b,119,120] Thus, 44 is an example of a siliconium ion, i.e. the hypervalent silicon analogue of a carbonium ion.
Figure 9. HF/6-31G(d) (normal print) and B3LYP/6-31G(d) (italics) geometry of cation 44. Bond lengths in Å and angles in degree. [44]

Willcott [127] proposed that in cations such as 45 a degenerate exchange between two tetracoordinated forms occurs according to process I in Scheme 8. On the basis of low-temperature 1H and 13C NMR measurements, these authors suggested that two equivalent structures 45a and 45a' are involved in a process of rapid interconversion which averages the NMR signals at room temperature. [127] Accordingly, the pentacoordinated structure 45b is the transition state for the rearrangement between 45a and 45a' (process I of Scheme 8). According to NMR measurements the free energy needed for the alleged exchange process is 12.1 kcal/mol in methanol and 10.4 kcal/mol in chloroform solution. [127] Corriu and co-workers [124] pointed out that the results of the NMR measurements depend on the counterion used when generating bidentate cations of type 41. In the case of a strongly interacting counterion (41 with R2Si+ = PhMeSi+ and Z = NMe2), 13C NMR shifts are in line with an exchange process.
Table 7
Bond density analysis. a

#     Molecule                 Sym   Atoms A,B   Distance A,B   Type of critical point   ρ(rb)   H(rb)    Character of bond
43a   H2SiPh(CH2NH2)2+         C2    Si,N        2.089          (3,-1)                   0.38    -0.11    weakly covalent
44    H2SiPh(CH2NMe2)2+        C2    Si,N        2.113          (3,-1)                   0.38    -0.13    weakly covalent
45b   Me2SiPh(CH2NMe2)2+       C2    Si,N        2.260          (3,-1)                   0.30    -0.11    weakly covalent
46a   H2SiPh(t-Bu)2+           C2    Si,H        1.952          (3,-1)                   0.21    -0.06    weakly covalent
47    Me2SiPh(t-Bu)2+          C2    Si,H        2.040          (3,-1)                   0.17    -0.03    electrostatic
48    H2SiAntMe2+              C2    Si,H        2.068          (3,-1)                   0.17    -0.03    electrostatic
49    Me2SiAntMe2+             C2    Si,H        2.200          (3,-1)                   0.13    -0.01    electrostatic

a Distance in Å, electron density ρ(rb) in electron Å^-3, energy density H(rb) in hartree Å^-3. Each stationary (critical) point rb is characterized by rank and signature (type of critical point). The character of the bond is given according to the criteria of Cremer and Kraka. [109] Me: methyl, Ph: phenyl, t-Bu: t-butyl. Calculations at the HF/6-31G(d) level. [44,47]
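To give a feeling for what the free energies of 12.1 and 10.4 kcal/mol reported for the proposed exchange process would mean kinetically, they can be converted into first-order rate constants with the Eyring equation, k = κ (kB T / h) exp(-ΔG‡ / RT). This is a rough illustration only: the temperature of 298.15 K and the transmission coefficient κ = 1 are assumptions made here and are not taken from Ref. 127, and the function name `eyring_rate` is illustrative.

```python
from math import exp

K_B = 1.380649e-23        # Boltzmann constant, J/K
H_PLANCK = 6.62607015e-34  # Planck constant, J s
R_KCAL = 1.987204e-3       # gas constant, kcal/(mol K)


def eyring_rate(dg_kcal_per_mol, temperature=298.15, kappa=1.0):
    """First-order rate constant in 1/s from an activation free energy."""
    prefactor = kappa * K_B * temperature / H_PLANCK
    return prefactor * exp(-dg_kcal_per_mol / (R_KCAL * temperature))


# Free energies quoted in the text (kcal/mol); the temperature is assumed.
k_methanol = eyring_rate(12.1)
k_chloroform = eyring_rate(10.4)

print(f"methanol:   k ~ {k_methanol:.1e} 1/s")
print(f"chloroform: k ~ {k_chloroform:.1e} 1/s")
print(f"ratio chloroform/methanol ~ {k_chloroform / k_methanol:.0f}")
```

Barriers of this size correspond to exchange that is fast near room temperature, which fits the averaged NMR signals mentioned in the text. The 1.7 kcal/mol difference between the two solvents translates into more than an order of magnitude in rate under these assumptions.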
However, in the case of a more weakly interacting counterion such as TPFPB- the NMR data are in line with the formation of a pentacoordinated siliconium ion. [124] It was concluded that the stability of the pentacoordinated ion strongly depends on solvent and counterions. Specifically, methanol seems to stabilize the tetracoordinated form extensively. [124]

Scheme 8. Process I, involving 45a (C1), 45b (C2), and 45a' (C1), and process II, involving 45b (C2), 45c (C1), and 45c' (C1).

Ottosson and Cremer [44] investigated processes I and II with the help of quantum chemical calculations to assess the stability of the various forms of 45 (compare with Figure 10). The pentacoordinated form 45b was found to be more stable than 45c by 10.4 kcal/mol in the gas phase and 6.9 kcal/mol in solution using the dielectric constant ε = 32.4 of methanol (Table 6, B3LYP/6-31G(d) calculations) while theory does not support the existence of 45a. Hence, process I of Scheme 8 does not seem to be likely for cation 45. The quantum chemical studies suggest that interactions between the CH2-NH2 ligands and R2Si+ in cations 41 resemble those between a N-containing solvent (e.g., an amine) and a R2HSi+ cation. In both cases it is possible that a pentacoordinated Si atom is formed. [43,44]

7.2. Weak intramolecular solvation of silylium ions
When Z in 41 represents a group with little coordination ability, intermolecular solvation of silylium ions in a weakly nucleophilic solvent can be modelled. [44] For Z = CH3, cations 46 and 47 s h o w n in Figure 11 are obtained, which should
273
Figure 10. HF, B3LYP (italics), and IPCM-B3LYP (italics in parentheses) geometries of cation 45 obtained with 6-31G(d). Bond lengths in A, angles in degree.
274
Figure 11. HF/6-31G(d) (normal print) and B3LYP/6-31G(d) (italics) geometries of 46a and 47. Bond lengths in A, and angles in degree.
model silylium ion-solvent interactions in an alkane solvent. Because of the low nucleophilicity of alkanes, one might expect these interactions to be negligible. Ab initio calculations reveal, however, that methyl groups interact significantly with the positive Si atom despite their weakly nucleophilic character. This is indicated by the IGLO 29Si shifts of 46 (25 ppm, Table 6) and 47 (146 ppm), which are upfield by 241 and 175 ppm, respectively, compared to the 29Si shifts calculated for reference compounds 7b and 42b (266 and 321 ppm). [44] Considering that even the noble gases argon and neon interact with the Si+ of a trialkylsilylium cation, [6] this result becomes understandable. Theoretical investigations clearly suggest that free silylium cations R3Si+ can only be generated in liquid helium. [6,44] In 46, the Si+ interacts with two H atoms of the adjacent methyl groups at distances of 1.925 Å. [44] This interaction is also revealed by a lengthening of the C-H bonds of the interacting H atoms. An analysis of the electron density distribution in the Si···H region suggests that weak covalent bonds are formed (Table 7). The Si,H interactions in 46 resemble those in the complex H3Si(CH4)2+, for which a coordination energy of 26 kcal/mol (B3LYP calculations) and an Si,H interaction distance of 1.98 Å were calculated. [44] Comparable results were found by Apeloig and co-workers, [128] who carried out a computational study on H3Si(CH4)+. Schaefer and co-workers [129] investigated SiH7+ and found that the cation can be regarded as an H3Si+ ion interacting with two H2 molecules. This was confirmed by Hiraoka and co-workers, [130] who studied gas phase clustering reactions of SiH3+ and H2 using a pulsed electron beam mass spectrometer. The energy for dissociation of SiH7+ into SiH5+ and H2 was measured to be 4.8 kcal/mol, and that for dissociation of SiH5+ into SiH3+ and H2 to be 14.2 kcal/mol, [130] both dissociation energies being in agreement with calculated values. [129]

These observations clearly show that weakly coordinating solvents such as alkanes are able to coordinate to silylium ions. However, dimethyl substitution at the Si atom, as in 47, increases the internal stability of the cation by hyperconjugation and, as a consequence, the Si,H interactions are weakened. In cation 47, the Si,H interaction distance is 1.98 Å (Figure 11), which is 0.06 Å longer than the corresponding distance in 46 (Figure 11). Nevertheless, the calculated 29Si NMR shift (146 ppm, Table 6) is still 175 ppm upfield from that of reference 42b (321 ppm, Table 6), indicating that considerable silylium ion character is lost. Ottosson and Cremer [44] estimated that the calculated shift value corresponds to 48% of remaining silylium ion character.

Coordination of methyl groups to the positively charged Si atom can be further reduced by using more rigid ring systems as templates. In this regard, the anthryl group was suggested and tested by Ottosson and Cremer as a template for internal solvation of a silylium ion. [44] By placing the R2Si+ group at C9 (C1 in Figure 12) and methyl groups at C1 (C3 in Figure 12) and C8 of the anthryl system, cations 48 and 49 shown in Figure 12 are obtained. In these cations, the methyl groups are positioned in such a way that interactions with Si+ become more difficult than in 46 and 47. As a consequence, Si,H interaction
distances are elongated to 1.98 and 2.11 Å (B3LYP results, Figure 12), and the calculated 29Si shifts are 64 and 187 ppm, i.e. about 200 and 130 ppm upfield of the corresponding reference values of 7b and 42b (Table 6). Analysis of the electron density distribution revealed that the Si,H interactions are weaker than in 46 and 47. [44] For 49, they are typical of electrostatic interactions rather than weak covalent bonding. From formula (3), an estimated silylium ion character of 60% was obtained, which suggests that this cation possesses the largest silylium cation character of all intramolecularly solvated silyl cations. [44]

Figure 12. HF/6-31G(d) (normal print) and B3LYP/6-31G(d) (italics) geometries of cations 48 and 49. Bond lengths in Å and angles in degrees.
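The upfield shifts quoted for the internally solvated cations 46 - 49 follow from simple differences against the reference compounds 7b and 42b. The short sketch below only redoes this arithmetic with the shift values given in the text; the dictionary layout and the function name are illustrative choices and not part of the original work.

```python
# Calculated IGLO 29Si shifts (ppm) of the reference compounds, as quoted in the text.
REFERENCE_SHIFTS = {"7b": 266.0, "42b": 321.0}

# Calculated 29Si shift (ppm) of each internally solvated cation and the
# reference compound it is compared with in the text.
SOLVATED_CATIONS = {
    "46": (25.0, "7b"),
    "47": (146.0, "42b"),
    "48": (64.0, "7b"),
    "49": (187.0, "42b"),
}


def upfield_shift(cation):
    """Return how far (in ppm) a cation lies upfield of its reference compound."""
    shift, reference = SOLVATED_CATIONS[cation]
    return REFERENCE_SHIFTS[reference] - shift


if __name__ == "__main__":
    for name in SOLVATED_CATIONS:
        print(f"{name}: {upfield_shift(name):.0f} ppm upfield")
```

The differences for 48 and 49 come out slightly above the rounded values of about 200 and 130 ppm quoted in the text.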
8. APPROACHING A NEARLY FREE SILYLIUM ION IN SOLUTION
In view of the discussion in the previous sections, the goal of generating a nearly free silylium ion in solution can only be realized if the following criteria are fulfilled:
(1) The silylium ion has to be internally stabilized by either inductive or hyperconjugative effects.
(2) Steric blocking of the Si+ center by bulky substituents R has to hinder coordination by solvent molecules S or counterions X-.
(3) The coordination ability of S has to be minimized by using weakly nucleophilic solvents.
(4) The solvent molecule has to be sterically demanding to make the attack on a (sterically also shielded) silylium cation more difficult.
(5) Counterions with a delocalized negative charge should be used to reduce their coordination ability.
(6) The bulk of the counterion will further reduce its coordination ability.
In the following, we discuss the possibility of generating nearly free silylium ions under the conditions of external solvation, with special emphasis on introducing conditions (1) - (6) stepwise to see which effects are most important for the realization of a free silylium cation in solution.
8.1. Trialkylsilylium ions in aromatic solvents

As it became clear that at least one or two of the criteria (1) - (6) had to be fulfilled to expect any success in the quest for free silylium cations in solution, researchers systematically started to use solvents with weak nucleophilic character as well as weakly coordinating counterions such as 33 or 35 shown in Scheme 5. The lead in this research was taken by Lambert, but Reed also made several important contributions; one can therefore say that both Lambert and Reed pushed forward the issue of silylium cations in solution despite scepticism and criticism concerning the usefulness of such work for general chemistry. Benzene and toluene were chosen as the ideal solvents, and in 1993 Lambert and Zhang reported on trialkyl substituted silyl cations in aromatic solvents in the presence of TPFPB as a counterion. [7] From measured 29Si NMR chemical shifts (80 - 110 ppm), Lambert and Zhang concluded that they had obtained R3Si+ (R = Me, Et, Pr, Me3Si) ions with reduced electrophilic interactions with solvent or anion and, therefore, with nearly free cationic structure. This description seemed to be confirmed just months later when Lambert and co-workers succeeded in crystallizing Et3Si+TPFPB- in toluene. [8] The X-ray diffraction investigation showed that the counterion TPFPB- is well separated from Si+ (closest contact distance: 4.18 Å), thus excluding any coordination between the silylium cation and the anion. However, the structure determination also revealed that the unit cell contains eight toluene molecules, one of which has a contact distance of 2.18 Å to the silylium cation. [8] Lambert and co-workers
regarded this distance as too long for covalent Si-C bonding and as not indicative of toluene coordination. As proof of this interpretation, the authors noted that the C-C bonds in the toluene solvent molecule with the closest contact deviate only slightly from their equilibrium values, thus suggesting only weak interactions between the Et3Si+ cation and toluene. The fact that the measured 29Si NMR chemical shifts for Et3Si+ and related R3Si+ cations in aromatic solvents are about 300 ppm upfield from calculated gas phase values of silylium cations (see Table 1) was considered to be a result of enforced pyramidalization of the R3Si+ cation rather than a result of solvent coordination leading to (some) covalent Si-C bonding. [7-9] Pyramidalization of the R3Si unit, in turn, was interpreted as a direct consequence of steric and/or environmental effects acting in the crystal. Lambert's result seemed to find further support in parallel work published by Reed and co-workers, [12-16] who also managed in some cases to prepare crystals from silylium cation solutions in aromatic solvents.

The fact that Lambert's crystal structure analysis of Et3Si+TPFPB- was published in Science, where it attracted much attention, explains why within the relatively short time of just a couple of months, work from three independent groups appeared that heavily criticized Lambert's interpretation of the nature of silylium cations in aromatic solvents. At the same time, Pauling [27] made the comment on Lambert's work that is already mentioned in section 1.2. Pauling concluded that in Et3Si(toluene)+TPFPB- the nearest C atom of toluene and Si are partially bonded according to an estimated bond order of 0.35 and that the positive charge of Si is distributed to an extent of 35% over the C atoms of the toluene molecule. Accordingly, one cannot speak of a free or nearly free silylium cation in the case of Et3Si+TPFPB- in toluene solution.
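A bond order of roughly one third for a Si-C contact of 2.18 Å can be made plausible with Pauling's empirical bond length/bond order relation D(n) = D(1) - c log10(n). The sketch below is an illustration only: the constant c = 0.60 Å and the Si-C single bond length of 1.88 Å are assumptions that do not come from the text, and the text does not state which parameters Pauling actually used in his comment.

```python
def pauling_bond_order(distance, single_bond_length, c=0.60):
    """Bond order n from D(n) = D(1) - c*log10(n); all lengths in angstrom.

    c = 0.60 angstrom is an assumed value of the empirical constant.
    """
    return 10.0 ** ((single_bond_length - distance) / c)


# Assumed Si-C single bond length in angstrom (not given in the text).
ASSUMED_SI_C_SINGLE_BOND = 1.88

# Closest Si-C(toluene) contact reported for the crystal structure (from the text).
SI_C_CONTACT = 2.18

if __name__ == "__main__":
    n = pauling_bond_order(SI_C_CONTACT, ASSUMED_SI_C_SINGLE_BOND)
    print(f"estimated Si-C bond order: {n:.2f}")
```

With these assumed parameters the estimate falls between 0.3 and 0.4, i.e. in the same range as the bond order of 0.35 quoted from Pauling; small changes in the assumed single bond length move the result within that range.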
Even before Pauling's criticism was published in Science as a technical comment, [27] elaborate ab initio calculations had been carried out by Olsson and Cremer, [39] Schleyer, Apeloig, and co-workers, [131] and Olah and co-workers. [21,22] These investigations unanimously showed that an R3Si+ cation in aromatic solvents immediately forms a Wheland σ-complex with a solvent molecule, as shown in Scheme 9 for the case of benzene. Hence, one obtains a carbocation rather than a silylium cation in aromatic solvents.

Scheme 9. [Structure drawing not reproduced: Wheland σ-complex of R3Si+ with benzene.]
Figure 13. HF/6-31G(d) geometry of Me3Si(C6H6)+ (51). Bond lengths in Å and angles in degrees.

In Table 8, the calculated properties of complexes 50 - 57 between silylium cations (1, 4, 58, 59, 60) and weakly nucleophilic solvent molecules, ranging from benzene [39,44,46] and toluene [22] (50 - 54) to methane (55) [6,45] and helium (56, 57), [6] are summarized. Figure 13, which is taken from the work of Olsson and Cremer, [39] gives the calculated ab initio geometry of the Wheland σ-complex Me3Si(C6H6)+ (51) formed by reaction of Me3Si+ (4) with a benzene molecule. For 4 in benzene solution, Lambert and Zhang measured a 29Si NMR chemical shift of 83.6 ppm, [7,9] which is in convincing agreement with the HF-IGLO shift value (83.1 ppm, Table 8) of 51 calculated by Olsson and Cremer. [39] The computed structure of 51 (Figure 13) is similar to the solid state structure of Et3Si(C6H5CH3)+ (52). [8] The binding energy of the complex was found to be 23.0 kcal/mol at HF/6-31G(d), [39] which is much larger than expected for a weak, non-covalently bonded association complex. The analysis of the electron density in the Si-C4 bond (Figure 13) revealed that the interaction is indeed of covalent character. Hence, generation of 4 in benzene solution leads to a carbocation, which according to calculated and measured 29Si NMR chemical shift values possesses less than 30% silylium ion character (see Table 8).

Reed and co-workers published X-ray crystal structures of tris(t-butyl)silylium and tris(i-propyl)silylium ions, for which the counterion Br6CB11H6- was found to be located at Si-Br distances of 2.47 and 2.48 Å, respectively. [15] Schleyer and co-workers showed with the help of ab initio calculations that the Br atom of the counterion interacts so strongly with the positive Si atom in this situation that little silylium ion character remains. [6] All these results suggested that Lambert and Reed had too little information from their experimental studies to draw
Table 8. Comparison of calculated properties of complexes R3Si(S)+ (S = solvent). a A hyphen marks an entry that is not given.

Complex  Cation            Solvent  ΔE         R(Si-C1)       δ29Si[R3Si(S)+]  δ29Si[R3Si+]  δ29Si[R3SiH]  %[R3Si+] c
50       H3Si+ (1)         benzene  49.8       2.078          -23.8            270.2         -99.9         20
51       Me3Si+ (4)        benzene  23.0       2.213          83.1             355.9         -16.6         27
52       Et3Si+ (58)       toluene  21.6       2.186          79.3             374.0         -             -
53       (Me3Si)3Si+ (59)  benzene  9.6 (8.9)  2.452 (2.293)  205.8 (111)      920.4         -119.6        31
54       (9-BBN)3Si+ (60)  benzene  4.4        2.568          (189)            -             -             (41)
55       H3Si+ (1)         methane  16.4       2.391          27.9             270.2         -99.9         34
56       H3Si+ (1)         He       <1         2.773          254              270.2         -99.9         96
57       Me3Si+ (4)        He       <1         3.224          345              347           -16.6         99

Complex  Averaged α(R-Si-R)  Largest α(R-Si-R)  α(Si-C1-C4)  ΔR(C-C) d  Ref.   Method
50       114.1               114.5              105.4        0.063      39     HF
51       114.0               114.5              107.6        0.046      39     HF
52       113.5               114.4              111.2        0.056      22     HF
53       114.2               115.2              110.5        0.033      45     HF
54       114.8               116.6              114.4        0.023      46     HF
55       -                   -                  -            -          6, 44  -
56       120                 120                -            -          6      MP2
57       120                 120                -            -          6      MP2

a Geometries (R in Å, α in deg) and binding energies ΔE (kcal/mol) calculated at the HF/ or MP2(FC)/6-31G(d) level. 29Si NMR chemical shifts in ppm relative to TMS calculated at IGLO/[7s6p2d/5s4p1d/3s1p]. C1 and C4 are the ipso and para C atoms of the benzene ring.
b Since more reliable DFT-IGLO 29Si NMR chemical shift values are available just for 1 (321 ppm, Table 1) and 4 (381 ppm, Table 1), HF-IGLO values have been used. They reproduce trends in 29Si NMR chemical shifts correctly.
c The percentage silylium cation character %[R3Si+] is calculated according to Eq. (3).
d ΔR(C-C): longest minus shortest C-C bond length of the aromatic ring system in R3Si(S)+.
e NMR/ab initio/IGLO results from Ref. 45.
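The percentage column of Table 8 can be checked with a few lines of code. Eq. (3) itself is not reproduced in this section, so the sketch below assumes that it is the linear interpolation between the shift of the parent silane R3SiH (0%) and that of the free cation R3Si+ (100%); this assumed form reproduces the tabulated percentages to within about one percentage point. The function name and data layout are illustrative only.

```python
def silylium_character(delta_complex, delta_cation, delta_silane):
    """Percentage silylium ion character from 29Si shifts (ppm).

    Assumed form of Eq. (3): linear interpolation between the parent
    silane (0%) and the free silylium cation (100%).
    """
    return 100.0 * (delta_complex - delta_silane) / (delta_cation - delta_silane)


# complex: (shift of R3Si(S)+, shift of R3Si+, shift of R3SiH, tabulated %),
# all taken from Table 8.
TABLE_8_ROWS = {
    "50": (-23.8, 270.2, -99.9, 20),
    "51": (83.1, 355.9, -16.6, 27),
    "53": (205.8, 920.4, -119.6, 31),
    "55": (27.9, 270.2, -99.9, 34),
    "56": (254.0, 270.2, -99.9, 96),
    "57": (345.0, 347.0, -16.6, 99),
}

if __name__ == "__main__":
    for name, (d_complex, d_cation, d_silane, tabulated) in TABLE_8_ROWS.items():
        value = silylium_character(d_complex, d_cation, d_silane)
        print(f"{name}: computed {value:.1f}%, tabulated {tabulated}%")
```

By construction, a complex whose shift equals that of the silane gives 0% and one whose shift equals that of the free cation gives 100%, which is why the helium complexes 56 and 57 come out close to 100%.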
reliable conclusions on the nature of the silyl cations they had generated in various aromatic solvents. The ab initio work clearly showed that calculations had to be combined with the results of experimental measurements to guarantee a reliable description of R3Si+ ions in solution. Hence, the somewhat provocative publications of Lambert and Reed triggered enhanced activities by quantum chemists that in the end led to a solution of the problem.

Clearly, the experiments of Lambert and Reed fulfilled some of the requirements for obtaining free silylium cations in solution, namely reduced nucleophilicity of the solvent, (in part) bulk of solvent and counterion molecules, and weak coordination ability of the counterions used. Also, a trialkylsilylium cation is internally stabilized by hyperconjugation. However, the most important requirement for obtaining a free silylium cation in solution, namely steric blocking of the Si+ center, was not fulfilled by the cations investigated. Therefore, Cremer and co-workers [44-47] calculated various silylium cations with bulky substituents, which represent interesting targets for experimental investigations.
8.2. Silyl substituted silylium ions in solution

Lambert and co-workers generated the tris(trimethylsilyl)silylium ion (Me3Si)3Si+ (59) with TPFPB- as counterion in aromatic solvents such as benzene or toluene. [7,9] The trimethylsilyl substituents of cation 59 are sterically demanding and may thereby prevent an aromatic solvent molecule from coordinating to the silylium ion. Olsson and Cremer calculated that the internal stability of 59 is larger than that of 4 (57.6 vs. 43.2 kcal/mol relative to H3Si+ at B3LYP/6-31G(d)). [45,46] According to a Mulliken population analysis, the charge at the central Si atom in 59 is less positive than in 4 (0.01 vs. 0.89 e) since the silyl groups inductively donate electron density to Si+ while the methyl groups are electron withdrawing. This should lower the electrophilic character of trisilyl substituted silylium ions and thus increase their kinetic stability in solution. Hyperconjugative stabilization in 59 is relatively small compared to that of alkyl substituted silylium ions since Si-Si bonds are longer than Si-C bonds and, accordingly, overlap between the 3pπ(Si) and pseudo-π(SiMe3) orbitals is reduced. In any case, 59 satisfies the basic criterion of a silylium cation, namely that the LUMO should be dominated by the 3pπ(Si+) orbital (see section 4.1).

The measured 29Si NMR chemical shift of 59 in benzene solution is 111 ppm, which was considered to indicate strong silylium ion character of 59 in solution. [7,9] This was based on the fact that the parent silane, (Me3Si)3SiH, possesses a shift value (-119.6 ppm, Table 8) that is 230 ppm upfield of that of 59, a difference that is large in comparison to the corresponding difference in shift values between silylium cations and parent silanes observed for trialkyl substituted systems.
However, all these considerations became rather irrelevant when the IGLO 29Si chemical shift of uncoordinated 59 was calculated to be 921 ppm, which is more than 800 ppm downfield from the experimental shift value of 111 ppm! [22,45] Ottosson and Cremer [45] calculated that the 29Si NMR chemical shift values of cations (H3Si)nH3-nSi+ are shifted downfield by approximately 220 ppm per silyl
group (compare with Figure 14a). The unusually positive δ29Si values of trisilyl substituted silylium ions are caused by large paramagnetic shift contributions, which are connected with a lowering of excited state energies and indicated by a decrease of the HOMO-LUMO energy gap with the number of silyl groups attached to the central Si atom. [45] When comparing silyl and alkyl substituted silylium ions, one sees that the HOMO-LUMO gap is much larger for the latter; accordingly, paramagnetic effects shift their δ29Si values only moderately downfield (Figure 14a) compared with the large effects calculated for silyl substituted silylium ions. Both hyperconjugative and inductive effects influence the HOMO-LUMO gap. A hyperconjugatively stabilizing R such as an alkyl group raises the energy of the LUMO and thereby increases the HOMO-LUMO gap (Figure 14b). On the other hand, the electron releasing inductive effect of a silyl group raises the energy of the HOMO and thereby decreases the HOMO-LUMO gap. Hence, replacement of alkyl groups (larger hyperconjugative effect plus electron withdrawing effect) by silyl groups (smaller hyperconjugative effect plus electron releasing effect) decreases the HOMO-LUMO gap, increases paramagnetic contributions to the 29Si NMR chemical shift, and leads to strongly downfield shifted δ29Si values (Figure 14a).
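The increment of approximately 220 ppm per silyl group invites a rough additive estimate. The sketch below assumes a strictly linear dependence on the number n of silyl substituents, starting from the HF-IGLO value of H3Si+ listed in Table 8; both the linearity and the combination of these two numbers are simplifying assumptions made here for illustration, and the comparison value refers to the trimethylsilyl substituted cation 59 rather than to (H3Si)3Si+.

```python
# HF-IGLO 29Si shift of H3Si+ in ppm (Table 8).
PARENT_SHIFT = 270.2

# Approximate downfield increment per silyl group in ppm (from the text).
INCREMENT_PER_SILYL = 220.0


def estimated_shift(n_silyl):
    """Additive estimate of the 29Si shift (ppm) of (H3Si)nH3-nSi+.

    Assumes a strictly linear increment per silyl group, which is only
    a rough reading of the trend described in the text.
    """
    if not 0 <= n_silyl <= 3:
        raise ValueError("n_silyl must be between 0 and 3")
    return PARENT_SHIFT + INCREMENT_PER_SILYL * n_silyl


if __name__ == "__main__":
    for n in range(4):
        print(f"n = {n}: about {estimated_shift(n):.0f} ppm")
```

For n = 3 the estimate is about 930 ppm, close to the value of 920.4 ppm listed in Table 8 for uncoordinated (Me3Si)3Si+ (59).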
[Figure 14 plots not reproduced. Horizontal axis in both panels: number n of SiH3 and CH3 substituents; curves for SiH3-n(SiH3)n+ and SiH3-n(CH3)n+.]
Figure 14. (a) Dependence of the δ29Si NMR chemical shift of the central Si atom in SiH3-n(SiH3)n+ and SiH3-n(CH3)n+ cations on the number n of SiH3 or CH3 substituents. (b) Dependence of the HOMO-LUMO gap ΔE = E(LUMO) - E(HOMO) in SiH3-n(SiH3)n+ and SiH3-n(CH3)n+ cations on the number n of SiH3 or CH3 substituents. [45]

Figure 15 shows the calculated geometry of cation 59. Despite the size of the trimethylsilyl groups, one can detect steric openings in the periphery of the ion (indicated by dashed triangles in Figure 15) that make an attack of a medium-sized solvent molecule from above or below the central Si atom of 59 possible.
Figure 15. HF/6-31G(d) and B3LYP/6-31G(d) (values in italics) geometry of (Me3Si)3Si+ (59). Bond lengths in Å and angles in degrees. The two inserts indicate the dimensions of the openings above (H1, H2, H3) and below (H4, H5, H6) the Si atom (black atoms).

Therefore, it is not surprising that, according to ab initio calculations, 59 in benzene solution forms the Wheland σ-complex (Me3Si)3Si(C6H6)+ (53, see Figure 16). [45] The geometry of 53 was determined by the NMR/ab initio/IGLO method using the PISA solvent model for the dielectric constant of benzene and considering the Si+,S interaction distance as the leading parameter (see section 2) of the complex geometry. In this way, it was found that the experimental 29Si shift of 111 ppm corresponds to a Si-C distance of 2.293 Å (see Figures 2 and 16), which is not longer than in other Wheland σ-complexes (2.08 - 2.29 Å, see Table 8). [6,22,39,131] The binding energy ΔE calculated for this distance is 8.9 kcal/mol.
[45] In 53, a weakly covalent Si-C bond connects ion 59 and benzene, thus leading to a tetracoordinated Si atom. According to the calculated δ29Si gas phase value of 59 and the measured δ29Si shift value of 53, the complex possesses just 31% silylium ion character. [45]

Although cation 59 does not fulfil expectations, its investigation gives some answers to the question of how an attack at the Si+ center by solvent molecules can be hindered by bulky substituents. Since the Me3Si groups in cation 59 are rather flexible, they can rotate and bend back so that an opening is formed above the Si+ atom, which gives a benzene molecule sufficient space for coordination with the central Si atom. To prevent this, more rigid and bulky substituents are needed.
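The idea behind the NMR/ab initio/IGLO determination of the Si-C distance in 53, namely varying the leading geometry parameter until the calculated shift matches the measured one, can be sketched as a one-dimensional root search. In the sketch below, shift_model is a purely hypothetical stand-in for the actual sequence of constrained geometry optimizations and IGLO shift calculations: it is a straight line drawn through the two (distance, shift) pairs listed for 53 in Table 8, so recovering 2.293 Å for 111 ppm is circular by construction and only demonstrates the search procedure.

```python
def shift_model(r_si_c):
    """Hypothetical stand-in for a computed 29Si shift (ppm) at a Si-C distance (angstrom).

    Straight line through the two Table 8 entries for complex 53:
    2.293 angstrom / 111 ppm and 2.452 angstrom / 205.8 ppm.
    """
    r1, d1 = 2.293, 111.0
    r2, d2 = 2.452, 205.8
    return d1 + (d2 - d1) * (r_si_c - r1) / (r2 - r1)


def fit_distance(target_shift, lo, hi, model, tol=1e-6):
    """Bisection search for the distance at which model(distance) equals target_shift."""
    f_lo = model(lo) - target_shift
    f_hi = model(hi) - target_shift
    if f_lo * f_hi > 0:
        raise ValueError("target shift is not bracketed by the given distances")
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        f_mid = model(mid) - target_shift
        if f_lo * f_mid <= 0:
            hi = mid  # root lies in the lower half
        else:
            lo, f_lo = mid, f_mid  # root lies in the upper half
    return 0.5 * (lo + hi)


if __name__ == "__main__":
    r_fit = fit_distance(111.0, 2.0, 3.0, shift_model)
    print(f"fitted Si-C distance: {r_fit:.3f} angstrom")
```

In an actual application each call to the model would correspond to a full quantum chemical calculation, so a search that needs few evaluations would be preferable to plain bisection.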
Figure 16. NMR/ab initio/IGLO geometry based on PISA-HF/6-31G(d) calculations of the Wheland σ-complex (Me3Si)3Si(C6H6)+ (53). Bond lengths in Å, angles in degrees.
8.3. Dialkylboryl substituted silylium ions in solution

In view of the discussion in section 8.2, one might think of adamantyl groups as appropriate substituents for shielding the positive Si atom of a silylium ion against solvent or counterion attacks. Besides their steric bulk, adamantyl substituents would also have the advantage of leading to larger hyperconjugative stabilization of Si+ compared to that found in cation 59. On the other hand, adamantyl groups would inductively withdraw electron density from the positive Si atom and, therefore, it is uncertain whether the internal stability of a triadamantylsilylium ion is higher than that of the Me3Si+ ion (43 kcal/mol). These considerations led to the idea of using bicyclic 9-borabicyclo[3.3.1]nonyl (9-BBN) groups as bulky substituents in silylium ions. [46] 9-BBN substituents are commonly used in synthetic organic chemistry to enhance the stereoselectivity of a reaction. [132] They are rather rigid and, in addition, should have the advantage of stabilizing silylium ions more than silyl groups because of favourable inductive and hyperconjugative effects. To test these predictions, boryl substituted silylium ions were investigated by Ottosson and co-workers. [46] Hyperconjugation between Si+ and boryl substituents is more stabilizing than in the case of silyl substituents since Si-B
bonds (1.98 - 2.03 Å) [133] are shorter than Si-Si bonds (2.34 - 2.40 Å). [117] Accordingly, orbital overlap between the 3pπ(Si+) orbital and adjacent pseudo-π orbitals is larger. In addition, the inductive effect of a boryl group should resemble that of a silyl group. [108] Most likely, boryl groups do not withdraw any electron density from the positive Si atom and, therefore, a tris(dialkylboryl)silylium ion has a lower electrophilicity and a better chance to resist attacks by solvent or counterions than trisilyl or trialkyl substituted silylium ions.
Figure 17. HF/6-31G(d) geometry of (9-BBN)3Si+ (60). Bond lengths in Å and angles in degrees. [46]

By attaching three 9-BBN groups to Si+, the (9-BBN)3Si+ cation (60) shown in Figure 17 is obtained. The calculated geometry of cation 60 reveals that three H atoms above and three H atoms below the plane of the Si+ center hinder coordination of solvent and counterion molecules. When comparing cations 59 and 60 (Figures 15 and 17), one finds that the amount of steric hindrance is clearly larger in 60. This can be seen from the inserts in Figures 15 and 17, which indicate the size of the openings above and below the central Si atom. These openings are clearly larger in 59 than in 60. Hence, a solvent molecule will have much more difficulty binding to cation 60. Also, the internal stability of 60 (62 kcal/mol) is somewhat higher than that of 59. [46]

Despite the steric bulk of the 9-BBN substituents, cation 60 can coordinate a benzene molecule, thus forming the complex 54 (Figure 18). This is predicted by ab initio calculations, which lead to a binding energy of just 4 kcal/mol, typical of a van der Waals complex. The Si-C distance is 2.55 Å, i.e. it is 0.10 Å longer than the corresponding distance in 53. The calculated charge transfer between 60 and the benzene ring as well as the electron density in the Si-C region reveal that only
weak interactions take place between ion and solvent molecule (Table 8). Hence, complex 54 seems to be one of the weakest Wheland σ-complexes so far described.
Figure 18. HF/6-31G(d) geometry of the σ-complex (9-BBN)3Si(C6H6)+ (54). Bond lengths in Å and angles in degrees.

The 29Si NMR chemical shift value of the free cation 60 was estimated to be 572 ppm by using cation (Me2B)3Si+ as a suitable model. [46] Complex (Me2B)3Si(C6H6)+ has a 29Si shift value of 189 ppm and the silane (Me2B)3SiH a shift of -59 ppm according to ab initio calculations. [46] Hence, the silylium cation character of complex 54 can be estimated to be about 40%, which clearly rules out speaking of a free silylium cation. However, solvent coordination can be further reduced and the degree of silylium ion character increased by a) using cyclohexane rather than benzene as a solvent and b) permethylation of the 9-BBN substituents, which increases their blocking properties. Alternatively, one could use boryl substituents with phenyl groups or even adamantyl groups attached to the B atom. Synthetic routes to generate these silylium cations were discussed by Ottosson and co-workers. [46]
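The estimate of about 40% follows directly from the three shift values quoted for the (Me2B)3Si model system. The worked equation below assumes that Eq. (3), which is not reproduced in this section, is the linear interpolation between the shift of the silane (0%) and that of the free cation (100%); under that assumption the numbers of the text give:

```latex
\%[\mathrm{R_3Si^+}]
  = 100 \cdot
    \frac{\delta\bigl[\mathrm{R_3Si(S)^+}\bigr] - \delta\bigl[\mathrm{R_3SiH}\bigr]}
         {\delta\bigl[\mathrm{R_3Si^+}\bigr] - \delta\bigl[\mathrm{R_3SiH}\bigr]}
  = 100 \cdot \frac{189 - (-59)}{572 - (-59)}
  = 100 \cdot \frac{248}{631}
  \approx 39
```

This agrees with the value of about 40% given in the text and with the entry (41) listed for 54 in Table 8 to within two percentage points.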
9. THE SOLUTION OF THE PROBLEM: FIRST GENERATION OF A FREE SILYLIUM CATION IN CONDENSED PHASES
In the years 1993 to 1996, the impression dominated that all attempts to generate silylium cations in solution would fail because of the extremely strong electrophilic character of R3Si+ ions, which in solution react even with alkanes or noble gas atoms, not to speak of more nucleophilic solvent molecules such as benzene, methylene chloride, or sulfolane, to form tetra-, penta-, or even higher coordinated Si compounds. Attempts to generate R3Si+ in weakly nucleophilic solvents such as benzene or toluene in the presence of TPFPB- had turned out to be failures. [7-16] Even more discouraging was the fact that the design and testing of potential silylium cations carrying large blocking groups to protect the Si+ center had not led to a free or nearly free silylium cation in solution. [44-47] These ions always seem to possess enough flexibility to bend away from the approaching solvent molecule, thus leaving an opening in the protection shell surrounding the Si center into which the solvent molecule can fit. It was also discouraging to find that enforced internal solvation leads to a considerable loss of silylium cation character even if alkyl groups are placed next to the -SiR2+ group using templates such as 41 or the anthryl template of 48 and 49. [44]

These observations led to the question of whether work on silylium cations in solution was still justified and whether one should continue to publish research results on silylium cations in the top ranking journals of chemistry. Despite the many failures, it was in particular Lambert who continued to work on the problem, ignoring all criticism and trying to reach a solution with better and better chemical tools. Independently of Lambert, Cremer and co-workers carried on with both ab initio and DFT calculations to solve the question of blocking the Si+ center of a silylium cation more efficiently against solvent attacks.
The breakthrough in the computational work was supported by two developments which at first sight had little to do with the silylium cation problem. In 1995, Olsson and Cremer [60,61] developed a new DFT-IGLO method, which is based on the sum-over-states density functional perturbation theory (SOS-DFPT) of Malkin and co-workers [63] and the original Hartree-Fock IGLO method of Kutzelnigg and Schindler. [29,30] The method of Olsson and Cremer involved an accurate calculation of the Coulomb part and numerical integration of the exchange-correlation potential. The well-known tendency of DFT methods to yield occupied orbitals with relatively high energies and, accordingly, to overestimate paramagnetic contributions to chemical shifts [64] was compensated by adding newly optimized level shift factors to the orbital energy differences, as originally suggested by Malkin and co-workers. [65] It turned out that the new DFT-IGLO method could calculate 13C and 29Si NMR chemical shifts with an accuracy of < 3 ppm. [60-62] This provided the basis for applying an NMR/DFT/IGLO rather than an NMR/ab initio/IGLO approach in connection with the silylium ion problem.
In 1996, Kraka, Sosa, and Cremer [62a] investigated dimesitylcarbonyl oxide with the help of DFT calculations. Sander and co-workers [134] had synthesized this compound as the first carbonyl oxide that can be generated in solution at -78 °C and investigated by NMR spectroscopy. Calculation of geometry, conformation, and NMR chemical shifts of this molecule using DFT fully confirmed the results of the experimentalists. [62a] Calculated and measured NMR chemical shifts differed by just a few ppm, which clearly suggested that the predictions by Sander [134] concerning the nature of the molecule were correct. Although it was not immediately clear, this work was of considerable importance for the investigation of free silylium cations in solution. The calculated geometry and conformation (confirmed by the NMR/DFT/IGLO method) of dimesitylcarbonyl oxide revealed that two mesityl groups effectively stabilize a labile carbonyl oxide molecule, R2COO, and protect it against attack by nucleophilic solvent molecules. After a similar study on dimesityl dioxirane, [62b] which is also stabilized and protected in solution by the two mesityl substituents, it became clear that mesityl groups are the ideal substituents to stabilize a silylium cation and to shield it in solution against solvent or counterion attacks, in particular when aromatic solvents and weakly coordinating counterions are used.

Of course, one can calculate many molecules that will never be experimentally investigated because of insurmountable problems in connection with their synthesis. Exactly this seemed to be the case for the trimesitylsilylium cation Mes3Si+ (61). Attempts to generate 61 via reaction (7)

Ph3C+ X- + Mes3Si-H  -->  Ph3C-H + Mes3Si+ X-    (7)
(X = TPFPB) in benzene solution fail because the bulky mesityl groups prevent the trityl cation from coming close enough to abstract a hydride ion from the silane Mes3SiH. However, Lambert and Zhao [135] solved this problem in an ingenious way. They first moved the reaction center of the silane away from the mesityl groups by introducing an allyl group. Then, they added the carbocation Et3SiCH2CPh2+ to the double bond, thus forming a new β-silyl carbocation Mes3Si-CH2-+CH-CH2-CPh2CH2SiEt3. It is well known (for example, from calculations [39]) that a Si-C bond in β position to a carbenium ion center is very labile, which can be used to fragment Mes3Si-CH2-+CH-CH2-CPh2CH2SiEt3 in exactly such a way that ion 61 is generated. Hence, Lambert and Zhao [135] manipulated the starting silane in such a way that a zipper was installed at the right position to obtain the desired silylium cation. The authors generated 61 in benzene solution using TPFPB as a suitable counterion (see reaction 8):

Et3SiCH2CPh2+ X- + Mes3Si-CH2-CH=CH2  -->  Mes3Si-CH2-+CH-CH2-CPh2CH2SiEt3 X-  -->  Mes3Si+ X- + CH2=CH-CH2-CPh2CH2SiEt3    (8)
The 29Si NMR chemical shift of 61 in benzene solution was measured to be 225.5 ppm, which suggested that Si possesses a considerable positive charge. [135] Lambert and Zhao showed further that the δ29Si value hardly changed (Δ < 1 ppm) when other aromatic solvents were used, which caused the authors to conclude that interactions between ion 61 and solvent molecules are weak and that a free silylium cation had been generated. [135]

Of course, an NMR chemical shift value of 225 ppm can be interpreted in different ways. One can assume that 61 represents a free silylium cation in solution that, because of considerable π-conjugation between the 3pπ(Si) orbital and the three phenyl rings, is internally stabilized and therefore does not interact with solvent molecules. The question in this case would be to what extent π-conjugation has reduced the silylium cation character and how much positive charge is spread over the C framework. It could be that 61 represents more a carbenium ion, in the same way as the silaguanidinium ion represents more an ammonium ion (see section 4.1).
Scheme 10. [Structure drawings not reproduced.] 61a: τ = equilibrium value; 61b: τ = 90°; 62: benzene complex of 61; 63a: τ = equilibrium value; 63b: τ = 90°.
The other alternative would be that π-conjugation in 61 is strongly suppressed because of steric strain between the mesityl groups and enforced rotation of these groups into a more orthogonal form. The gas phase chemical shift value of 61 could be in the region of 300 ppm, similar to that of orthogonal 7 (7b, see Table 6); interactions with the benzene molecules of the solvent could, however, cause an upfield shift to 225 ppm. Clearly, neither situation (silylium cation with strong delocalization of positive charge or solvent complexed silylium cation) would justify speaking of a free silylium cation in benzene solution in the case of 61.

To clarify the situation, Cremer and co-workers [48] carried out an extensive DFT study of both 61 and a potential benzene complex of 61 (62, Scheme 10). Results of this investigation are summarized in Figures 19, 20, and 21 as well as in Table 9. In Figures 19 and 20, the DFT-B3LYP geometry of cation 61 is shown for its equilibrium form 61a (Figure 19) and for form 61b (Figure 20), in which the mesityl groups are rotated by 90° with regard to the plane defined by the SiC3 unit (compare with Scheme 10 for a definition of the rotation angle τ). Cation 61a possesses C3 symmetry and is characterized by a planar SiC3 unit (C-Si-C angle: 120°) at its center. The mesityl groups are each rotated by τ = 47.3° out of the SiC3 reference plane in the manner of a propeller, thus avoiding close contacts between the o-positioned methyl groups. The closest H,H contact between two o-methyl groups of different mesityl substituents is 2.85 Å, while the closest H,H contacts within a mesityl group are 2.38 and 2.28 Å (see Figure 19), which is in the range of typical van der Waals distances between H atoms (2.2 - 2.4 Å).
[107] The barrier of 61a for a synchronous rotation of the three mesityl propeller blades defined by the relative energy of 61b (~ = 90 ~ is 32.1 kcal/mol, which is significantly higher than the corresponding barrier of the triphenylsilylium cation (63, 26.2 kcal/mol, Table 9 [48]). The relatively high barrier of 61a is caused by close H,H contacts of just 2.06 and 2.10 A between the o-methyl groups (Figure 20), which cannot be avoided if one or all mesityl groups rotate. Hence, rotation of the mesityl groups and averaging of its NMR chemical shifts can be excluded for 61a at room temperature. [48] Cremer and co-workers [48] calculated a DFT-IGLO 829Si value of 226.3 ppm for 61a, which differs from the experimental one (225.5 p p m [135]) measured in benzene solution by just 0.8 ppm. As can be seen from Figure 20 suppressing of ~conjugation in a rotated form and, at the same time close contacts in a sterically crowded form such as 61b would drastically change this value to 375 ppm. [48] However, interactions with a solvent molecule in benzene solution have almost now influence on the chemical shift value since the closest contact distance between 61 and benzene is 5.87 A and, by this, far outside the range of typical Si,C van der Waals distances. Also, the calculated binding energy aE of 1.8 kcal/mol indicates only very weak solvent-solute interactions. Cremer and co-workers [48] assessed the degree of electron delocalization in 61a by comparing aryl-substituted silylium cations with regard to calculated rotational energies, the calculated electron population of the 3p~(Si) orbital, the 13C NMR chemical shift of the p-positioned C atom, and the 29Si NMR chemical shift values (Table 9). They came to the conclusion that about 70% ~-delocalization of
Table 9. DFT-B3LYP/6-31G(d) energies, DFT-IGLO/[7s6p2d/5s4p1d/3s] NMR chemical shifts, and Si charges of aryl-substituted silylium cations. (a)

Molecule             Conf         Sym   Energy ΔE   δ29Si   3pπ(Si)   +q(Si)   δ13C (b)
61a  SiMes3+         τ = 47.3     C3      0         226.3   404       541      156.2
61b  SiMes3+         τ = 90       C3     32.1       373.5   128       648      158.4
62   SiMes3+·C6H6    τ = 45-48    C1      1.8       227.6   402       545      155.5
63a  SiPh3+          τ = 27.2     D3      0         193.5   488       552      142.0
63b  SiPh3+          τ = 90       D3h    26.2       279.9   123       614      136.5

(a) Dihedral angles τ in deg (for the definition of the dihedral angle, see Scheme 10), energy differences in kcal/mol, NMR chemical shifts in ppm relative to TMS from Ref. 48. The electron population of the 3pπ(Si) orbital and the positive charge of the Si atom (+q) are given in melectron.
(b) p-Positioned C atom in an aryl group.
Figure 19. B3LYP geometry and DFT-IGLO NMR chemical shifts of the trimesitylsilylium cation in its equilibrium form (61a). [48] Geometrical parameters are given for the mesityl group in the upper right part. Some non-bonded distances between H atoms are indicated for the mesityl group in the upper left part. NMR chemical shifts are given for the mesityl group at the bottom. Distances in Å, angles in deg, NMR chemical shifts relative to TMS in ppm.

Figure 20. B3LYP geometry and DFT-IGLO NMR chemical shifts of form 61b. [48] Geometrical parameters are given for the mesityl group on the right side. Some non-bonded distances between H atoms are indicated in the lower part of the drawing. NMR chemical shifts for the mesityl group on the left are also shown. Distances in Å, angles in deg, NMR chemical shifts relative to TMS in ppm.

Figure 21. B3LYP geometry and DFT-IGLO NMR chemical shifts of complex 62. [48] Geometrical parameters are given for the mesityl group and the benzene molecule on the right side. Some non-bonded distances between 61 and benzene are indicated by dashed lines. NMR chemical shifts for the mesityl group on the left are also shown. Distances in Å, angles in deg, NMR chemical shifts relative to TMS in ppm.
an ideal triphenyl or trimesityl silylium cation with τ = 0° is retained in cation 61a. The calculations clearly revealed that a Si,C contact leading to even weak covalent interactions between the Si+ center and the benzene molecule, in the sense of a Wheland σ-complex, is impossible in the case of 61. The agreement of the DFT-IGLO δ29Si NMR chemical shift values of 226-227 ppm (61 and 62) with the measured value of 225.5 ppm [135], the low stability of complex 62 (1.8 kcal/mol), and the large Si,C(benzene) contact distance of 5.9 Å strongly suggest that 61 in benzene solution represents a free silylium cation.

Clearly, the silylium cation character is reduced by π-conjugation and delocalization of positive charge over the mesityl rings. The 3pπ(Si) orbital of 61 is filled up by just 40%, i.e. 30% more than in the orthogonal form 61b with no π-conjugation (but some hyperconjugation accounting for a 10% filling of the 3pπ(Si) AO, Table 9). Also, the LUMO of 61 is dominated by the 3pπ(Si) orbital, which makes 61 a true silylium cation according to the criteria discussed in section 4.1. Therefore, one can speak in the case of 61 of the first free silylium cation ever synthesized in solution. [48,135]

The synthesis of 61 by Lambert and Zhao [135] proves that it is possible to generate free silylium cations in solution. The success of this work is based on fulfilling the requirements listed in section 8, namely (1) internal stabilization of the cation, (2) steric blocking of the Si+ center, (3) use of a weakly nucleophilic solvent with (4) not too small solvent molecules, and (5,6) use of bulky, weakly coordinating counterions with delocalized anionic charge. Just as it is possible to generate free silylium cations in solution, it is also possible to investigate and analyze the solvation process (see section 1.1) that leads from a free silylium cation to a solvated, strongly coordinated silyl cation.
One can test what happens when the solvent is changed from benzene to, e.g., methylene chloride or acetonitrile [135]. In these cases, solvent molecules are small enough to squeeze into openings between the protecting mesityl groups and to coordinate to Si+. By varying the size and donicity of the solvent molecules, it should be possible to set up a scale of different binding energies ΔE and 29Si NMR chemical shifts for complexes Mes3Si(S)+. In addition, one can reduce the steric bulk of the aryl substituents stepwise and investigate how this influences the solvation process. Hence, the synthesis of 61 has provided an excellent basis for future work on the solvation process by both experimental and theoretical means.
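The numerical arguments of this section can be cross-checked with a short Python sketch. It takes the entries of Table 9 and the measured shift of 225.5 ppm [135] as input and evaluates (i) the 3pπ(Si) populations as percentages, (ii) the deviation of the calculated δ29Si values from experiment, and (iii) a rough rate for the synchronous mesityl rotation from the Eyring equation. Several points are assumptions made only for this illustration and are not taken from Ref. 48: the populations are read as a percentage of one electron (which is how 404 melectron matches the 40% quoted in the text), the calculated energy barrier is used in place of a free energy of activation, the transmission coefficient is set to 1, and the temperature is taken as 298.15 K.

```python
import math

# Values copied from Table 9: relative energies in kcal/mol, 29Si shifts in
# ppm, 3p-pi(Si) populations in millielectrons.
TABLE_9 = {
    "61a": {"dE": 0.0,  "shift_29Si": 226.3, "pop_3p_pi": 404},
    "61b": {"dE": 32.1, "shift_29Si": 373.5, "pop_3p_pi": 128},
    "62":  {"dE": 1.8,  "shift_29Si": 227.6, "pop_3p_pi": 402},
    "63a": {"dE": 0.0,  "shift_29Si": 193.5, "pop_3p_pi": 488},
    "63b": {"dE": 26.2, "shift_29Si": 279.9, "pop_3p_pi": 123},
}
EXPERIMENTAL_SHIFT_PPM = 225.5  # measured in benzene solution [135]

K_B = 1.380649e-23        # Boltzmann constant, J/K
H_PLANCK = 6.62607015e-34  # Planck constant, J s
R_KCAL = 1.987204e-3       # gas constant, kcal/(mol K)


def filling_percent(pop_millielectron):
    """Population expressed as a percentage of one electron (assumption)."""
    return pop_millielectron / 10.0


def eyring_rate(barrier_kcal_mol, temperature_k=298.15):
    """Eyring rate constant in 1/s.

    Assumes the calculated barrier can stand in for the free energy of
    activation and that the transmission coefficient equals 1.
    """
    prefactor = K_B * temperature_k / H_PLANCK
    return prefactor * math.exp(-barrier_kcal_mol / (R_KCAL * temperature_k))


for name, row in TABLE_9.items():
    deviation = row["shift_29Si"] - EXPERIMENTAL_SHIFT_PPM
    print(f"{name}: 3p-pi filling {filling_percent(row['pop_3p_pi']):.1f}%, "
          f"shift deviation from experiment {deviation:+.1f} ppm")

for name in ("61b", "63b"):
    rate = eyring_rate(TABLE_9[name]["dE"])
    print(f"rotation via {name}: k(298 K) is about {rate:.1e} 1/s")
```

Under these assumptions the estimated rotation rate for the 32.1 kcal/mol barrier is many orders of magnitude below one event per second, which is in line with the statement that averaging of the mesityl NMR signals by rotation can be excluded at room temperature.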
ACKNOWLEDGEMENT
This work was supported by the Swedish Natural Science Research Council (NFR). All calculations were done on a CRAY YMP/416 and C94 of the Nationellt Superdatorcentrum (NSC), Linköping, Sweden. The authors thank the NSC for a generous allotment of computer time.
REFERENCES
1. S. Winstein, D.S. Trifan, J. Am. Chem. Soc., 71 (1949) 2953.
2. The extensive research on the norbornyl cation is discussed in a series of reviews: (a) C.A. Grob, Acc. Chem. Res., 16 (1983) 426. (b) H.C. Brown, Acc. Chem. Res., 16 (1983) 432. (c) G.A. Olah, G.K.S. Prakash, M. Saunders, Acc. Chem. Res., 16 (1983) 440. (d) C. Walling, Acc. Chem. Res., 16 (1983) 448.
3. We use the term silylium ion for R3Si+, rather than silicenium ion, silylenium ion, or silyl cation, thus following IUPAC recommendations. See Nomenclature of Inorganic Chemistry: G.J. Leigh, Ed.; Blackwell: Oxford, U.K., 1990, p. 106.
4. For recent reviews on R3Si+ in solution see (a) J.B. Lambert, L. Kania, S. Zhang, Chem. Rev., 95 (1995) 1191. (b) J. Chojnowski, W. Stanczyk, Main Group Chem. News, 2 (1994) 6. (c) P.D. Lickiss, J. Chem. Soc. Dalton Trans., (1992) 1333. (d) J. Chojnowski, W. Stanczyk, Adv. Organomet. Chem., 30 (1990) 243.
5. H. Schwarz in The Chemistry of Organic Silicon Compounds, S. Patai, Z. Rappoport, Eds.; Wiley Interscience; New York, 1989, p. 445, and references cited therein.
6. C. Maerker, J. Kapp, P.v.R. Schleyer in Organosilicon Chemistry II: From Molecules to Materials, N. Auner, J. Weis, Eds.; VCH, Weinheim, 1996, p. 329.
7. J.B. Lambert, S. Zhang, J. Chem. Soc., Chem. Comm., (1993) 383.
8. J.B. Lambert, S. Zhang, C.L. Stern, J.C. Huffman, Science, 260 (1993) 1917.
9. J.B. Lambert, S. Zhang, S.M. Ciro, Organometallics, 13 (1994) 2430.
10. J.B. Lambert, S. Zhang, Science, 263 (1994) 986.
11. Z. Xie, D.J. Liston, T. Jelinek, V. Mitro, R. Bau, C.A. Reed, J. Chem. Soc., Chem. Comm., (1993) 384.
12. C.A. Reed, Z. Xie, R. Bau, A. Benesi, Science, 262 (1994) 402.
13. C.A. Reed, Z. Xie, Science, 263 (1994) 986.
14. Z. Xie, R. Bau, C.A. Reed, J. Chem. Soc., Chem. Commun., (1994) 2519.
15. Z. Xie, R. Bau, A. Benesi, C.A. Reed, Organometallics, 14 (1995) 3933.
16. Z. Xie, J. Manning, R.W. Reed, R. Mathur, P.D.W. Boyd, A. Benesi, C.A. Reed, J. Am. Chem. Soc., 118 (1996) 2922.
17. See e.g. (a) P.v.R. Schleyer, S. Sieber, Angew. Chem. Int. Ed. Engl., 32 (1993) 1606. (b) P.R. Schreiner, D.L. Severence, W.L. Jorgensen, P.v.R. Schleyer, H.F. Schaefer III, J. Am. Chem. Soc., 117 (1995) 2663. (c) S.A. Perera, R.J. Bartlett, J. Am. Chem. Soc., 117 (1995) 8476.
18. G.A. Olah, L. Heiliger, X.-Y. Li, G.K.S. Prakash, J. Am. Chem. Soc., 112 (1990) 5991.
19. G.K.S. Prakash, S. Keyaniyan, R. Aniszfeld, L. Heiliger, G.A. Olah, R.C. Stevens, H.K. Choi, R. Bau, J. Am. Chem. Soc., 109 (1987) 5123.
20. G.A. Olah, G. Rasul, L. Heiliger, J. Bausch, G.A. Olah, J. Am. Chem. Soc., 114 (1992) 7737.
21. G.A. Olah, G. Rasul, L. Heiliger, J. Bausch, G.K.S. Prakash, Science, 263 (1994) 983.
22. G.A. Olah, G. Rasul, H.A. Buchholz, X.-Y. Li, G.K.S. Prakash, Bull. Chim. France, 132 (1995) 569.
23. G.A. Olah, G. Rasul, G.K.S. Prakash, J. Organomet. Chem., 521 (1996) 271.
24. (a) J. Kapp, P.R. Schreiner, P.v.R. Schleyer, J. Am. Chem. Soc., 118 (1996) 12154. (b) P.v.R. Schleyer, Science, 275 (1997) 39. (c) A. Sekiguchi, M. Tsukamoto, M. Ichinohe, Science, 275 (1997) 60.
25. (a) C. Reichardt, Solvents and Solvent Effects in Organic Chemistry, VCH, Weinheim, 1988. (b) The Chemical Physics of Solvation, R.R. Dogonadze, E. Kalman, A.A. Kornyshev, J. Ulstrup, Edts., Elsevier, Amsterdam, 1985. (c) V. Gutman, Coord. Chem. Rev., 18 (1976) 225.
26. The term was coined first by Schleyer [33], however the same approach was independently used by Schleyer and Cremer at the same time. See, e.g., D. Cremer, L. Olsson, F. Reichel, E. Kraka, Isr. J. Chem., 33 (1993) 369.
27. L. Pauling, Science, 263 (1994) 983.
28. (a) R. Ditchfield, J. Chem. Phys., 56 (1972) 5688. (b) R. Ditchfield, Mol. Phys., 27 (1974) 789.
29. W. Kutzelnigg, Isr. J. Chem., 19 (1980) 193.
30. M. Schindler, W. Kutzelnigg, J. Chem. Phys., 76 (1982) 1919.
31. W. Kutzelnigg, M. Schindler, U. Fleicher in NMR, Basic Principle and Progress, P. Diehl, Edt.; Springer, Berlin, 1991, Vol. 23, p. 165.
32. (a) M. Schindler, W. Kutzelnigg, J. Am. Chem. Soc., 105 (1983) 1360. (b) M. Schindler, W. Kutzelnigg, Mol. Phys., 48 (1983) 781.
33. See, e.g., (a) P. Buzek, P.v.R. Schleyer, S. Sieber, Chem. Unserer Zeit, 26 (1992) 116. (b) M. Bühl, P.v.R. Schleyer in Electron Deficient Boron and Carbon Clusters, Eds.: G.A. Olah, K. Wade, R.E. Williams, Wiley, New York, 1991. (c) M. Bühl, N.J.R. Hommes, P.v.R. Schleyer, U. Fleicher, W. Kutzelnigg, J. Am. Chem. Soc., 113 (1991) 2459.
34. F. Reichel, Ph.D. Thesis, University of Köln, Germany, 1991.
35. (a) P. Svensson, F. Reichel, P. Ahlberg, D. Cremer, J. Chem. Soc. Perkin Trans. II, (1991) 1463. (b) D. Cremer, P. Svensson, E. Kraka, P. Ahlberg, J. Am. Chem. Soc., 115 (1993) 7445.
36. S. Sieber, P.v.R. Schleyer, A.H. Ottos, J. Gauss, F. Reichel, D. Cremer, J. Phys. Org. Chem., 6 (1993) 445.
37. D. Cremer, F. Reichel, E. Kraka, J. Am. Chem. Soc., 113 (1991) 9459.
38. D. Cremer, L. Olsson, F. Reichel, E. Kraka, Isr. J. Chem., 33 (1993) 369.
39. L. Olsson, D. Cremer, Chem. Phys. Lett., 215 (1993) 413.
40. D. Cremer, L. Olsson, H. Ottosson, J. Mol. Struct. (Theochem), 313 (1994) 91.
41. L. Olsson, C.-H. Ottosson, D. Cremer, J. Am. Chem. Soc., 117 (1995) 7460.
42. L. Olsson, Ph.D. Thesis, University of Göteborg, Sweden, 1996.
43. M. Arshadi, D. Johnels, U. Edlund, C.-H. Ottosson, D. Cremer, J. Am. Chem. Soc., 118 (1996) 5120.
44. C.-H. Ottosson, D. Cremer, Organometallics, 15 (1996) 5309.
45. C.-H. Ottosson, D. Cremer, Organometallics, 15 (1996) 5495.
46. C.-H. Ottosson, K. Szabo, D. Cremer, Organometallics, 16 (1997) 2377.
47. C.-H. Ottosson, Ph.D. Thesis, University of Göteborg, Sweden, 1996.
48. E. Kraka, C. Sosa, J. Gräfenstein, C.-H. Ottosson, D. Cremer, Chem. Phys. Lett., in press.
49. R.F. Childs, Acc. Chem. Res., 17 (1984) 347.
50. (a) R.F. Childs, R. Faggiani, C.J.L. Lock, M. Mahendran, J. Am. Chem. Soc., 108 (1986) 3613. (b) R.F. Childs, A. Varadarajan, C.J.L. Lock, R. Faggiani, C.A. Fyfe, R.E. Wasylishen, J. Am. Chem. Soc., 104 (1982) 2452.
51. (a) R.C. Haddon, J. Org. Chem., 44 (1979) 3608. (b) R.C. Haddon, J. Am. Chem. Soc., 110 (1988) 1108.
52. For a review, see D. Cremer, R.F. Childs, E. Kraka in The Chemistry of the Cyclopropyl Group, Vol. 2, Z. Rappoport, Ed.; John Wiley and Sons Ltd; New York, 1995, p. 339.
53. See, e.g., S.A. Perera, M. Nooijen, R. Bartlett, J. Chem. Phys., 104 (1996) 3290.
54. M. Bühl, T. Steinke, P.v.R. Schleyer, R. Boese, Angew. Chem. Int. Ed. Engl., 30 (1991) 1160.
55. L.R. Thorne, R.D. Suenram, F.J. Lovas, J. Chem. Phys., 78 (1983) 167.
56. (a) S.G. Shore, R.W. Parry, J. Am. Chem. Soc., 77 (1955) 6084. (b) E.W. Hughes, J. Am. Chem. Soc., 78 (1956) 502. (c) E.L. Lippert, W.N. Lipscomb, J. Am. Chem. Soc., 80 (1958) 503.
57. H. Nöth, B. Wrackmeyer, Chem. Ber., 107 (1974) 3070.
58. (a) M. Bühl, J. Gauss, M. Hofmann, P.v.R. Schleyer, J. Am. Chem. Soc., 115 (1993) 12385. (b) S. Sieber, P.v.R. Schleyer, J. Am. Chem. Soc., 115 (1993) 6887.
59. (a) J. Gauss, J. Chem. Phys., 99 (1993) 3629. (b) J. Gauss, J.F. Stanton, J. Chem. Phys., 103 (1995) 3561. (c) J. Gauss, J.F. Stanton, J. Chem. Phys., 104 (1996) 2574. (d) O. Christiansen, J. Gauss, J.F. Stanton, Chem. Phys. Lett., 266 (1997) 53.
60. L. Olsson, D. Cremer, J. Chem. Phys., 105 (1996) 8995.
61. L. Olsson, D. Cremer, J. Phys. Chem., 100 (1996) 16881.
62. (a) E. Kraka, C.P. Sosa, D. Cremer, Chem. Phys. Lett., 260 (1996) 43. (b) K. Schroeder, W. Sander, R. Boese, S. Muthusamy, A. Kirschfeld, E. Kraka, C. Sosa, D. Cremer, J. Am. Chem. Soc., 119 (1997) 7265.
63. V.G. Malkin, O.L. Malkina, D.R. Salahub, Chem. Phys. Lett., 204 (1993) 80.
64. A.M. Lee, N.C. Handy, S.M. Coldwell, J. Chem. Phys., 103 (1995) 10095.
65. V.G. Malkin, O.L. Malkina, M.E. Casida, D.R. Salahub, J. Am. Chem. Soc., 116 (1994) 5898.
66. A.D. Becke, Phys. Rev. A, 38 (1988) 3098.
67. J.P. Perdew, Y. Wang, Phys. Rev. B, 45 (1992) 13244.
68. A.D. Becke, J. Chem. Phys., 98 (1993) 5648.
69. P.J. Stevens, F.J. Devlin, C.F. Chablowski, M.J. Frisch, J. Phys. Chem., 98 (1994) 11623.
70. C.W. Bauschlicher Jr., H. Partridge, Chem. Phys. Lett., 240 (1995) 533.
71. (a) R. Gutbrod, R.N. Schindler, E. Kraka, D. Cremer, Chem. Phys. Lett., 252 (1996) 221. (b) R. Gutbrod, E. Kraka, R.N. Schindler, D. Cremer, J. Am. Chem. Soc., 119 (1997) 7330. (c) E. Kraka, D. Cremer, J. Fowler, H.F. Schaefer, J. Am. Chem. Soc., 118 (1996) 10595.
72. (a) S. Miertus, E. Scrocco, J. Tomasi, J. Chem. Phys., 55 (1981) 117. (b) R. Bonaccorsi, R. Cimiraglia, J. Tomasi, J. Comp. Chem., 4 (1983) 567. (c) J.L. Pascual-Ahuir, E. Silla, J. Tomasi, R. Bonaccorsi, J. Comp. Chem., 8 (1987) 778.
73. J.B. Foresman, T.A. Keith, K.B. Wiberg, J. Snoonian, M.J. Frisch, J. Phys. Chem., 100 (1996) 16098.
74. Gaussian92, M.J. Frisch, M. Head-Gordon, G.W. Trucks, J.B. Foresman, H.B. Schlegel, K. Raghavachari, M.A. Robb, J.S. Binkley, C. Gonzalez, D.J. Defrees, D.J. Fox, R.A. Whiteside, R. Seeger, C.F. Melius, J. Baker, R.L. Martin, L.R. Kahn, J.J.P. Stewart, S. Topiol, J.A. Pople, Gaussian 92, Gaussian Inc., Pittsburgh Pa., 1992.
75. K.M. Gough, J. Chem. Phys., 91 (1989) 2424.
76. (a) J. Almlöf, K. Jr. Faegri, K. Korsell, J. Comp. Chem., 3 (1982) 385. (b) D. Cremer, J. Gauss, J. Comp. Chem., 7 (1986) 274. (c) M. Häser, R. Ahlrichs, J. Comp. Chem., 10 (1989) 104.
77. COLOGNE96, E. Kraka, J. Gauss, F. Reichel, L. Olsson, Z. He, Z. Konkoli, D. Cremer, University of Göteborg, 1996.
78. R.J.P. Corriu, M. Henner, J. Organomet. Chem., 74 (1974) 1, and references cited therein.
79. For a review on hypercoordinated Si compounds, see C. Chuit, R.J.P. Corriu, C. Reye, J.C. Young, Chem. Rev., 93 (1993) 1371.
80. J.B. Lambert, W.J.J. Schulz, J. Am. Chem. Soc., 105 (1983) 1671.
81. J.B. Lambert, J.A. McConnell, W.J.J. Schulz, J. Am. Chem. Soc., 108 (1986) 2482.
82. J.B. Lambert, W. Schilf, J. Am. Chem. Soc., 110 (1988) 6364.
83. J.B. Lambert, W.J.J. Schulz, J.A. McConnell, W. Schilf, J. Am. Chem. Soc., 110 (1988) 2201.
84. J.B. Lambert, L. Kania, W. Schilf, J.A. McConnell, Organometallics, 10 (1991) 2578.
85. J.Y. Corey, J. Am. Chem. Soc., 97 (1975) 3237.
86. See, e.g., C. Eaborn, J. Organomet. Chem., 405 (1991) 173.
87. (a) G.A. Olah, G.K.S. Prakash, J. Sommer, Superacids, Wiley Interscience, New York, 1985. (b) Carbonium Ions, Vols. I-V, G.A. Olah, P.v.R. Schleyer, Eds.; Wiley Interscience, New York, 1968-1976. (c) G.A. Olah, Angew. Chem. Int. Ed. Engl., 34 (1995) 1393.
88. (a) P.v.R. Schleyer, W. Koch, B. Lin, U. Fleicher, J. Chem. Soc. Chem. Commun., (1989) 1098. (b) J.B. Nicholas, T. Xu, D.H. Barich, P.D. Torres, J.F. Haw, J. Am. Chem. Soc., 118 (1996) 4202. (c) S. Sieber, P. Buzek, P.v.R. Schleyer, W. Koch, J.W.M. Carneiro, J. Am. Chem. Soc., 115 (1993) 259. (d) P.v.R. Schleyer, C. Maerker, Pure & Appl. Chem., 67 (1995) 755.
89. M. Saunders, H.A. Jiménez-Vazquez, Chem. Rev., 91 (1991) 375.
90. Y. Marcus, Ion Solvation, Wiley Interscience, New York, 1985.
91. J. Cioslowski, M. Martinov, J. Chem. Phys., 103 (1995) 4967.
92. U. Pidun, M. Stahl, G. Frenking, Chem. Eur. J., 2 (1996) 869.
93. C.-H. Ottosson, D. Cremer, unpublished results.
94. M.W. Schmidt, P.N. Truong, M.S. Gordon, J. Am. Chem. Soc., 109 (1987) 5217.
95. E. Kraka, Ph.D. Thesis, Univ. of Köln, Germany, 1984.
96. R.J. Gillespie, T.E. Peel, J. Am. Chem. Soc., 95 (1973) 5173.
97. J. Bacon, P.A.W. Dean, R.J. Gillespie, Can. J. Chem., 47 (1969) 1655.
98. T. Laube, Angew. Chem. Int. Ed. Engl., 25 (1986) 349.
99. S. Hollenstein, T. Laube, J. Am. Chem. Soc., 115 (1993) 7240.
100. T. Laube, Helv. Chim. Acta, 77 (1994) 943.
101. T. Laube, E. Schaller, Acta Cryst. B, (1995) 117.
102. T. Laube, Acc. Chem. Res., 28 (1995) 399.
103. D.P. Kelly, D.R. Leslie, J. Am. Chem. Soc., 112 (1990) 4268.
104. R.J. Gillespie, F.G. Ridell, D.R. Slim, J. Am. Chem. Soc., 98 (1976) 8069.
105. G.A. Olah, D.J. Donavan, J. Am. Chem. Soc., 100 (1978) 5163.
106. E. Kraka, D. Cremer, unpublished results.
107. Lange's Handbook of Chemistry, J.A. Dean, Edt., McGraw-Hill, New York, 1979.
108. N. Inamoto, S. Masuda, Chem. Lett., (1982) 1003.
109. (a) E. Kraka, D. Cremer in Theoretical Models of the Chemical Bond, Part 2: The Concept of the Chemical Bond, Z.B. Maksic, Ed.; Springer, 1990, p. 453. (b) D. Cremer, E. Kraka, Angew. Chem. Int. Ed. Engl., 23 (1984) 627. (c) D. Cremer, E. Kraka, Croatica Chem. Acta, 57 (1984) 1259.
110. S.H. Strauss, Chem. Rev., 93 (1993) 927.
111. B.T. King, Z. Janousek, B. Grüner, M. Trammell, B.C. Noll, J. Michl, J. Am. Chem. Soc., 118 (1996) 3313.
112. (a) G.A. Olah, G.K.S. Prakash, R. Krishnamurti, Adv. Silicon Chem., 1 (1991) 1. (b) H. Emde, D. Domsch, H. Feger, U. Frick, A. Götz, H.H. Hergott, K. Hofmann, W. Kober, K. Krägeloh, T. Oesterle, W. Steppen, W. West, G. Simschen, Synthesis, (1982) 1.
113. (a) A.R. Bassindale, J. Jiang, J. Organomet. Chem., 446 (1993) C3. (b) A.R. Bassindale, T. Stout, J. Organomet. Chem., 271 (1984) C1. (c) A.R. Bassindale, T. Stout, Tetrahedron Lett., 26 (1985) 3403. (d) A.R. Bassindale, T. Stout, J. Chem. Soc. Perkin Trans. II, (1986) 221. (e) A.R. Bassindale, T. Stout, J. Chem. Soc. Chem. Comm., (1984) 1387. (f) A.R. Bassindale, T. Stout, J. Organomet. Chem., 238 (1982) C41.
114. (a) M. Kira, T. Hino, H. Sakurai, J. Am. Chem. Soc., 114 (1992) 6697. (b) M. Kira, T. Hino, H. Sakurai, Chem. Lett., (1993) 153.
115. S. Bahr, P. Boudjouk, J. Am. Chem. Soc., 115 (1993) 4514.
116. G.A. Olah, L. Xing-Ya, Q. Wang, G. Rasul, G.K.S. Prakash, J. Am. Chem. Soc., 117 (1995) 8962.
117. W.S. Shieldrick in The Chemistry of Organic Silicon Compounds, Eds.: S. Patai, Z. Rappoport, Wiley Interscience, U.K., 1989, p. 227.
118. (a) K. Hensen, T. Zengerly, P. Pickel, G. Klebe, Angew. Chem., 95 (1983) 739. (b) K. Hensen, T. Zengerly, T. Müller, P. Pickel, Z. Anorg. Allg. Chem., 558 (1988) 21.
119. J. Belzner, D. Schär, B.O. Kneisel, R. Herbst-Irmer, Organometallics, 14 (1995) 1840.
120. C. Breliere, F. Carré, R.J.P. Corriu, M.W.C. Man, J. Chem. Soc., Chem. Comm., (1994) 2333.
121. C. Chuit, R.J.P. Corriu, A. Mehdi, C. Reyé, Angew. Chem. Int. Ed. Engl., 32 (1993) 1311.
122. F. Carré, C. Chuit, R.J.P. Corriu, A. Mehdi, Angew. Chem. Int. Ed. Engl., 37 (1994) 1097.
123. M. Chauhan, C. Chuit, R.J.P. Corriu, C. Reyé, Tetrahedron Lett., 37 (1996) 845.
124. M. Chauhan, C. Chuit, R.J.P. Corriu, A. Mehdi, C. Reyé, Organometallics, 15 (1996) 4326.
125. (a) J.T.B.H. Jastrezebski, G. van Koten, Adv. Organomet. Chem., 35 (1993) 241. (b) G. van Koten, Pure Appl. Chem., 62 (1990) 1155.
126. (a) G. van Koten, Pure & Appl. Chem., 61 (1989) 1681. (b) G. van Koten, J.T.B.H. Jastrzebski, J.G. Noltes, A.L. Spek, J.C. Schoone, J. Organomet. Chem., 148 (1978) 233.
127. V.A. Benin, J.C. Martin, M.R. Willcott, Tetrahedron Lett., 35 (1994) 2133.
128. Y. Apeloig, O. Merin-Aharoni, D. Danovich, A. Ioffe, S. Shaik, Isr. J. Chem., 33 (1993) 387.
129. (a) C.-H. Hu, P.R. Schreiner, P.v.R. Schleyer, H.F. Schaefer III, J. Phys. Chem., 98 (1994) 5040. (b) C.-H. Hu, M. Shen, H.F. Schaefer, Chem. Phys. Lett., 190 (1992) 543.
130. K. Hiraoka, J. Katsuragawa, A. Minamitsu, Chem. Phys. Lett., 267 (1997) 580.
131. P.v.R. Schleyer, T. Müller, Y. Apeloig, H.-U. Siehl, Angew. Chem. Int. Ed. Engl., 32 (1993) 1471.
132. A. Pelter, K. Smith, H.C. Brown, Borane Reagents, Academic Press, New York, 1988.
133. (a) R.J. Wilczek, D.S. Matteson, J.G. Douglas, J. Chem. Soc. Chem. Comm., (1976) 401. (b) J. Pfeiffer, W. Maringgele, A. Meller, Z. Anorg. Allg. Chem., 511 (1984) 185. (c) A. Blumentahl, P. Bissinger, H. Schmidbaur, J. Organomet. Chem., 462 (1993) 107, and references cited therein.
134. A. Kirschfeld, S. Muthusamy, W. Sander, Angew. Chem. Int. Ed. Engl., 33 (1994) 2212.
135. J.B. Lambert, Y. Zhao, Angew. Chem., 109 (1997) 389.
Z.B. Maksić and W.J. Orville-Thomas (Editors) Pauling's Legacy: Modern Modelling of the Chemical Bond Theoretical and Computational Chemistry, Vol. 6 © 1999 Elsevier Science B.V. All rights reserved.
Bond energies, enthalpies of formation, and homologies: the energetics of aliphatic and alicyclic hydrocarbons and some of their derivatives

Suzanne W. Slayden (a) and Joel F. Liebman (b)

(a) Department of Chemistry, George Mason University, 4400 University Drive, Fairfax, VA 22030, USA
(b) Department of Chemistry and Biochemistry, University of Maryland Baltimore County, 1000 Hilltop Circle, Baltimore, MD 21250, USA

Tetracoordination, tetrahedral geometry and sp3-hybridization are three key concepts in organic chemistry. Likewise, homologous series are useful for obtaining information about related organic molecules. In the current study we present an analysis of 1-substituted alkanes as an archetypical homologous series defined by an increasing number of methylene groups. The series of cycloalkanes are also characterized by such increasing numbers although they do not behave homologously. Finally, we present an analysis of the enthalpies of formation of tetrahedrane and [1.1.1]propellane in terms of homologous series where this concept is almost as strained as the molecules.

1. TETRACOORDINATION, TETRAHEDRAL GEOMETRY AND HYBRIDIZATION

Tetracoordinated, tetrahedral, sp3-hybridized carbon is a principal structural component introduced early in the study of organic chemistry. Tetracoordinate and tetrahedral are not synonyms, however. The wondrous structural diversity of strained hydrocarbons belies that plausible equivalence. Tetrahedral shape and sp3-hybridized orbitals are not necessarily related either. Although the latter is often asserted to be a primary electronic structure component for the organic chemist, atomic hybridization is now well-established to be far more subtly varied than the 1:3 s:p ratio this description denotes.
This can be unequivocally demonstrated, for example, by analysis of 13C-H and 13C-13C NMR coupling constants [1] and from orbital populations derived from high quality ab initio calculations [2] on GH3X species (G = C, Si, Ge, Sn and Pb, X = H and numerous monoatomic and polyatomic groups). Yet the concepts of structure and energy; of hybrids and geometry; of atoms and molecules; of bonds and orbitals are truly interwoven and inseparable, and the terms "tetrahedral carbon", "tetracoordinate carbon" and "sp3-hybridized carbon" are very often used synonymously and interchangeably.
2. THE NUMBER OF COMPOUNDS AND THE NECESSITY FOR INTERCONNECTIONS, HOMOLOGIES AND HOMOLOGOUS SERIES

We acknowledge the number of different molecular species that have been chronicled by the practicing chemical community (>10^7). We likewise acknowledge the impressively large number of different species that are easily hypothesized even by a beginning student and that still elude the most experienced experimentalist. We thus admit to the fervent hope that these compounds, known and unknown, are conceptually interconnected. That is, despite the inherent intricacies of the Schrödinger equation, spectra, and synthesis; of melting points, many-body effects, and multipoles; of color, crystal packing, and chirality; the acquired chemical knowledge of one species (or of some selected, and so defined as a "smallish" number of, species) is presumed to qualitatively and quantitatively relate to knowledge about others. As chemists, we often speak of molecular homologies and homologous series. In this essay, homologous series are taken to be a collection of molecules that are related by constant changes in the number of groups (e.g. methylene or -CH2-, (Z)-disubstituted vinylene or -CH=CH-) and of larger structural features (e.g. the number of three-membered rings or 3MRs). We will limit our attention to molecular energetics such as bond energies and enthalpies of formation [3] in the gas phase and bypass considerations of molecular geometries such as bond lengths and angles. The decision as to topic and to phase simplifies our understanding and aids our predictions: we discuss only intramolecular interactions.

3. HOMOLOGOUS SERIES: THE 1-SUBSTITUTED ALKANES

We now turn to our first homologous series of organic compounds, 1-substituted n-alkanes: CH3X, CH3CH2X, CH3CH2CH2X, ..., CH3(CH2)nX, ...
Members of this series are composed of tetracoordinate, almost tetrahedral, and putatively sp3-hybridized carbon atoms. Their structures differ by a constant CH2, or methylene, group. The difference between the gas phase enthalpies of formation of successive members of this series, i.e.

$$\delta_1(n, X) = \Delta H_f[\mathrm{CH_3(CH_2)_{n+1}X}] - \Delta H_f[\mathrm{CH_3(CH_2)_{n}X}] \qquad (1)$$

approaches a constant, but the constant itself depends more or less on X, and unequivocally for n = 1. If the principle of bond additivity were completely valid and experiments were without flaws or "glitches", the quantity δ1(n, X) would be a constant independent of the value of n and of the substituent X. This is simply shown if we recast enthalpies of formation in terms of bond energies where the sum of bond energies for a molecule may be thermodynamically equated to the energy of atomization. Assuming both the enthalpies of formation and atomization energies refer to the molecule at the same temperature, a simple correction for
energies of atomization of the elements and for the ca. 1 RT difference between energy and enthalpy (2.4 kJ mol⁻¹) interconverts the two quantities. Ignoring all internal structure of X because it is both complicated to write and irrelevant in the current context, we find the total bond energy of CH3X equals 3D(C-H) + D(C-X) where D(C-H) and D(C-X) are the average C-H and C-X bond energies. Strict bond additivity results in the situation in which the average C-H and C-X bond energies would equal the C-H and C-X bond energies as found in any arbitrary molecule and so the total bond energy of CH3CH2X would equal precisely 5D(C-H) + D(C-C) + D(C-X). The difference between the total bond energies of CH3CH2X and CH3X would then equal 2D(C-H) + D(C-C), a quantity independent of the choice of X. It may readily be shown that the difference between the sum of the bond energies of CH3CH2X and CH3X also equals the difference between the sum of the bond energies of CH3(CH2)n+1X and CH3(CH2)nX: perhaps the simplest proof replaces X in the first pair of molecules by (CH2)nX in the second. We would also conclude that the following reaction is thermoneutral for all pairs of substituents X and Y:

$$\mathrm{CH_3(CH_2)_mX + CH_3(CH_2)_pY \rightarrow CH_3(CH_2)_nX + CH_3(CH_2)_qY} \qquad (2)$$
for all nonnegative integers m, n, p and q such that m + p = n + q. However, these conclusions are not supported upon examination of the thermochemical literature on n-alkyl derivatives [4]. A practicable approach to studying the enthalpy-of-formation differences in eq. 1 recognizes that the gaseous and liquid enthalpies of formation for compounds in a homologous series are linearly correlated with the number of carbon atoms (nc) in the molecules. For compounds in the gaseous state

$$\Delta H_f(\mathrm{g}) = (\alpha \cdot n_c) + \beta \qquad (3)$$
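As an illustration of how the constants α and β of eq. 3 can be obtained, the following Python sketch performs a weighted linear least-squares fit in which each point is weighted by the inverse square of its uncertainty, the weighting scheme described in the note to Table 1. The input data are synthetic: they are generated from the n-RH constants listed in Table 1, and the uncertainties are hypothetical values chosen only to exercise the weighting. The function name and data layout are likewise assumptions made for this sketch, not the procedure or data of the original evaluations.

```python
def weighted_linear_fit(x, y, sigma):
    """Fit y = alpha * x + beta, weighting each point by 1 / sigma**2."""
    w = [1.0 / s ** 2 for s in sigma]
    sw = sum(w)
    sx = sum(wi * xi for wi, xi in zip(w, x))
    sy = sum(wi * yi for wi, yi in zip(w, y))
    sxx = sum(wi * xi * xi for wi, xi in zip(w, x))
    sxy = sum(wi * xi * yi for wi, xi, yi in zip(w, x, y))
    denom = sw * sxx - sx ** 2
    alpha = (sw * sxy - sx * sy) / denom   # slope: methylene increment
    beta = (sxx * sy - sx * sxy) / denom   # intercept
    return alpha, beta


# Synthetic input: points lying exactly on the n-RH line of Table 1 (kJ/mol).
ALPHA_TABLE, BETA_TABLE = -20.63, -43.20
n_c = [4, 5, 6, 7, 8]
dHf = [ALPHA_TABLE * n + BETA_TABLE for n in n_c]
sigma = [0.7, 0.6, 0.8, 0.9, 1.0]  # hypothetical uncertainties, kJ/mol

alpha, beta = weighted_linear_fit(n_c, dHf, sigma)
print(f"alpha = {alpha:.2f} kJ/mol per CH2, beta = {beta:.2f} kJ/mol")
```

Because the synthetic points lie exactly on a line, the fit simply recovers the constants it was built from; with real experimental enthalpies the weights decide how strongly each compound pulls on the slope.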
A clear exception in almost every series is the methyl-substituted compound, which deviates from the otherwise linear relationship of the other members [5]. For 1-substituted n-alkanes, the slope, α, in eq. 3 is called the "methylene increment" because it is the statistically-derived difference between the enthalpies of formation of successive homologs in a series due to the structural difference between them: a methylene group. Although early workers [6] attempted to define a Universal methylene increment derived solely from the n-alkane series (-20.61 kJ/mol) which would be applicable to other homologous series, it was soon apparent as additional data became available that the n-alkanol series was significantly discrepant [7]. Extensive evaluation [8] of eleven homologous series showed the methylene increment to be statistically identical (-20.67 ± 0.05 kJ/mol) for all the series (nc > 2) except for the n-alkanols (-20.178 ± 0.40 kJ/mol) and n-alkyl bromides (-20.241 ± 0.118 kJ/mol). Our more recent analyses [9], using yet newer data, show little change but they do demonstrate the sensitivity of the regression equation results to the chosen experimental data
(see examples in Table 1). Analysis of series for which data are available only for nc < 4 typically gives disparate slopes [10].

Table 1. Constants from the linear regression analysis of equation 3 for several homologous series (kJ mol⁻¹)

Homologous series   nc                     α               β                 Standard error in ΔHf
n-RH                C4-C12, C16, C18       -20.63 ± 0.05   -43.20 ± 0.53     0.71
n-ROH               C4-C10, C12, C16       -20.14 ± 0.04   -194.73 ± 0.40    0.46
n-RSH               C4-C7, C10             -20.46 ± 0.14   -7.03 ± 0.95      0.65
n-RCl               C4, C5, C8, C12, C18   -20.83 ± 0.05   -71.34 ± 0.59     0.63
n-RBr               C4-C8, C12, C16        -20.23 ± 0.08   -26.78 ± 0.70     0.79
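The constants of Table 1 can be put to work directly. The Python sketch below evaluates eq. 3 for a chosen series and carbon number, and then tests the thermoneutrality expected for reaction 2 under strict bond additivity. The function names are hypothetical, and the example (exchanging two methylene groups between an n-alkane and an n-alkanol) is an illustration chosen here; it is only meaningful for carbon numbers inside the ranges listed in Table 1.

```python
# (alpha, beta) in kJ/mol, copied from Table 1.
TABLE_1 = {
    "n-RH":  (-20.63, -43.20),
    "n-ROH": (-20.14, -194.73),
    "n-RSH": (-20.46, -7.03),
    "n-RCl": (-20.83, -71.34),
    "n-RBr": (-20.23, -26.78),
}


def enthalpy_of_formation(series, n_c):
    """Gas phase enthalpy of formation from eq. 3, in kJ/mol."""
    alpha, beta = TABLE_1[series]
    return alpha * n_c + beta


def reaction_2_enthalpy(series_x, nc_x_in, nc_x_out,
                        series_y, nc_y_in, nc_y_out):
    """Enthalpy change of reaction 2, written in terms of carbon numbers.

    The total number of carbon atoms must be conserved, which is the
    condition m + p = n + q of the text.
    """
    if nc_x_in + nc_y_in != nc_x_out + nc_y_out:
        raise ValueError("carbon atoms are not conserved")
    products = (enthalpy_of_formation(series_x, nc_x_out)
                + enthalpy_of_formation(series_y, nc_y_out))
    reactants = (enthalpy_of_formation(series_x, nc_x_in)
                 + enthalpy_of_formation(series_y, nc_y_in))
    return products - reactants


# n-hexane from the n-RH regression line.
print(f"{enthalpy_of_formation('n-RH', 6):.2f} kJ/mol")

# n-hexane + 1-octanol -> n-octane + 1-hexanol
dH = reaction_2_enthalpy("n-RH", 6, 8, "n-ROH", 8, 6)
print(f"reaction enthalpy: {dH:.2f} kJ/mol")
```

In this linear model the intercepts cancel and the reaction enthalpy reduces to the number of methylene groups exchanged times the difference between the two methylene increments, so it vanishes only if the two series share the same α.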
In the least squares analyses of eq. 3, the individual enthalpies were weighted inversely as the squares of the experimental uncertainty intervals. In all cases, r² > 0.9999. The standard errors were generated from the unweighted enthalpies. nc is the number of carbon atoms in the compound.

A composite, or universal, methylene increment is useful for estimating enthalpies of formation for compounds not belonging to an evaluated series. For the purpose of interpolating the enthalpy of formation of a member of an evaluated series, the unique methylene increment for that series should be used. We have done this on numerous occasions to estimate a value for a presumably incorrect experimental enthalpy of formation. Extrapolation is another matter. Calculated values of the slope and intercept in eq. 3 are quite sensitive to the input data and therefore the enthalpies from even a moderately long extrapolation are affected by the choice of data. We wonder whether, once nc becomes large enough, the influence of the substituent Z wanes and the methylene increments for various homologous series become truly Universal and identical to the n-alkane increment. Ref. 8 showed that at least up to nc = 16, the influence of the hydroxyl group in n-ROH on the slope of eq. 3 is not diminished.

The success of empirical additivity schemes [11, 12] is based on the recognition that the contributions to the enthalpy depend not only on the atoms and bonds present but also on their particular grouping, for example as CH3, CH2, or CH. Thus, popular and successful estimation schemes use parameters which are proportional to the number of groups of atoms in the molecule, implicitly or explicitly using for all series a methylene increment derived from alkanes. The enthalpy changes resulting from the presence of heteroatoms require appropriately derived bond energy terms. If there are any real differences in the
methylene increment from various series, they are subsumed in this term. Additional terms correct for steric interactions, for example.

Recently, a series of articles [13] has described methods for accurately (< 4.184 kJ mol$^{-1}$) reproducing enthalpies of formation from ab initio MO and statistical mechanical energies together with empirically derived bond and group equivalents. Equation 4 shows the relationship between the enthalpy of formation and the electronic, thermal, and structural energy components devised by the authors.

$$\Delta H_f^\circ = \mathrm{HFE} + \mathrm{TOR} + \mathrm{POP} + 4RT + \Sigma\mathrm{BE} + \Sigma\mathrm{GE} \qquad (4)$$
HFE is the Hartree-Fock energy calculated at the HF/6-31G*//HF/6-31G* level of theory; TOR is the energy stored in low-lying vibrational/rotational states; POP is the energy associated with higher-energy conformations; 4RT is the sum of the translational and rotational freedom of the non-linear molecule (6/2 RT) and the PV equivalent for converting energy to enthalpy (1 RT); and BE and GE are the bond and group equivalents, respectively. The assigned value of TOR increases by 0.00067 hartree per carbon (1 hartree = 2625.46 kJ mol$^{-1}$), beginning with ethane or methanol (TOR = -0.00067). POP is typically evaluated by molecular mechanics calculations [12]. Unlike TOR, POP becomes linear with $n_c$ at different values of $n_c$ depending on the homologous series. For alkanes, the constant POP increment (0.00056 hartree per carbon) begins at $n_c$ = 4 (n-butane) and for alcohols at $n_c$ = 3 (n-propanol). Thus, at larger $n_c$, both POP and TOR are proportional to $n_c$ in both series.

Calculated electronic energies are available for alkanes [13d, 14] and alcohols [13b] with $n_c$ = 1-6. A linear regression of HFE (hartree) vs. $n_c$ produces for n-alkanes ($n_c$ = 4-6)

$$\mathrm{HFE}(n\text{-RH}) \pm 0.000061 = [(-39.034695 \pm 0.000043) \cdot n_c] - (1.159595 \pm 0.000219)$$

and for n-alkanols ($n_c$ = 3-6)

$$\mathrm{HFE}(n\text{-ROH}) \pm 0.000061 = [(-39.034686 \pm 0.000027) \cdot n_c] - (76.00640 \pm 0.000127)$$

In both cases, $r^2$ = 1.0 with variation in the correlation coefficient in the twelfth decimal place. As with the experimental enthalpies of formation, methane and methanol deviate from linearity. Within the uncertainty limits of the regression, the electronic methylene increments are identical for the n-alkanes and n-alkanols. If the analysis is performed for $n_c$ = 2-6 for both homologous series, the slopes are not statistically identical.
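The comparison of the two HFE slopes can be sketched numerically. The snippet below is only an illustration: the overlap criterion (difference no larger than the sum of the two standard errors) and the function names are assumptions, not the authors' stated procedure. It also adds the constant per-carbon TOR and POP increments to the n-alkane HFE slope, which gives the per-carbon slope of the summed energy components.

```python
# Regression slopes quoted in the text, as (value, standard error) in hartree per carbon.
HFE_SLOPE_ALKANE = (-39.034695, 0.000043)   # n-alkanes, n_c = 4-6
HFE_SLOPE_ALKANOL = (-39.034686, 0.000027)  # n-alkanols, n_c = 3-6

# Constant per-carbon thermal increments quoted in the text (hartree).
TOR_PER_CARBON = 0.00067
POP_PER_CARBON = 0.00056


def intervals_overlap(a, b):
    """Assumed criterion: |difference| <= sum of the two standard errors."""
    (value_a, err_a), (value_b, err_b) = a, b
    return abs(value_a - value_b) <= err_a + err_b


def thermally_corrected_slope(hfe_slope):
    """Add the constant TOR and POP increments to an HFE slope."""
    return hfe_slope + TOR_PER_CARBON + POP_PER_CARBON


print(intervals_overlap(HFE_SLOPE_ALKANE, HFE_SLOPE_ALKANOL))
print(round(thermally_corrected_slope(HFE_SLOPE_ALKANE[0]), 6))
```

Because 4RT is a constant, it shifts only the intercept and leaves the slope unchanged.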
The sum of the above energy components and the constant 4RT (0.00382 hartree) must linearly correlate with the number of carbon atoms in the compound for n-alkanes ($n_c \geq 4$) and for n-alkanols ($n_c \geq 3$) and again, the slopes must be identical:
$$\Sigma E = (\mathrm{HFE} + \mathrm{TOR} + \mathrm{POP} + 4RT) = (\alpha' \cdot n_c) + \beta' \qquad (5)$$
The linear regression results are, for alkanes ($n_c$ = 4-6)

$$\Sigma E(n\text{-RH}) \pm 0.000061 = [(-39.033465 \pm 0.000043) \cdot n_c] - (1.159545 \pm 0.000219)$$

and for alcohols ($n_c$ = 3-6)

$$\Sigma E(n\text{-ROH}) \pm 0.000057 = [(-39.033459 \pm 0.000026) \cdot n_c] - (76.004902 \pm 0.000120)$$

To the extent that errors in the statistical mechanical assumptions are negligible and these computed energies accurately reflect nature, it appears the experimentally different enthalpy-of-formation methylene increments for n-alkanes and n-alkanols are not due to their electronic or thermal energy components.

The bond and group energy contributions to equation 4 can only be calculated empirically, by subtracting $\Sigma E$ from $\Delta H_f^\circ$. In Ref. 13d, the hydrocarbon BE and GE parameters were determined from a least squares fit of a wide variety of structurally different alkanes. Because those terms are redundant for a homologous series, we define the n-alkyl series term (ST) according to equation 6

$$\Delta H_f^\circ = \Sigma E + \mathrm{ST} \qquad (6)$$
Because both $\Delta H_f^\circ$ and $\Sigma E$ are linear with respect to $n_c$, so also is ST. The results from the weighted least squares analysis [15] are for alkanes ($n_c$ = 4-6)

$$\mathrm{ST}(n\text{-RH}) \pm 0.000232 = [(39.025552 \pm 0.000164) \cdot n_c] + (1.1433195 \pm 0.000832)$$

and for alcohols ($n_c$ = 3-6)

$$\mathrm{ST}(n\text{-ROH}) \pm 0.000182 = [(39.025781 \pm 0.000081) \cdot n_c] + (75.930875 \pm 0.000377)$$

We might have thought, because the experimental $\Delta H_f^\circ$ methylene increment slopes are different for the two series while the $\Sigma E$ methylene increment slopes are identical, that the methylene increment slopes calculated from ST would be non-identical. The slopes from the linear regressions of ST vs. $n_c$ for the two series are just within each other's standard errors. This statistical sameness of all slopes [16] is an artifact because, as it must, combining the $\Sigma E$ and ST methylene increments for each homologous series reproduces the non-identical slopes from equation 3 for the compounds of relevant $n_c$. From the slope/intercept constants generated above and eq. 6, the calculated enthalpy of formation (hartree) for an alkane with $n_c$ = 4-6 is

$$\Delta H_f^\circ = [(-0.007913 \pm 0.000170) \cdot n_c] - (0.016226 \pm 0.000860)$$

and for an alcohol with $n_c$ = 3-6 is

$$\Delta H_f^\circ = [(-0.007678 \pm 0.000085) \cdot n_c] - (0.074027 \pm 0.000396)$$

The standard errors reported for the constants in the two equations above were calculated as the root mean square of the component standard errors and this propagation of uncertainty in the combined equations causes the slopes to barely overlap, whereas there is no overlap of the slopes calculated from eq. 3. We cannot hope to use these equations to calculate accurate enthalpies of formation for alkanes or alcohols with $n_c$ > 6 for the same reason we cannot use eq. 3 and the same restricted data range: the calculated results do not reproduce the experimental results due to the sensitivity of the calculation to the slope/intercept.

We said earlier that $\alpha$ and $\beta$ from equation 3 are affected by the enthalpy-of-formation data used to generate these constants. Historically, the values for various homologous series changed only slightly as newer and additional experimental data were amassed. But subjective, albeit experienced, judgment is exercised when authors choose the "best" enthalpy [11] or the "selected" values [3] to recommend in a compilation. Other than the enthalpy values themselves, the uncertainty intervals also affect the regression constants in the weighted least squares analyses. Calculations [7, 11b, 17] show the precision of measuring heat evolution in combustion calorimetry may be about ±0.13 kJ mol$^{-1}$ per C atom. Using the experimental enthalpies of formation [P] for those alcohols and alkanes discussed above, we can produce statistically identical slopes for these series by changing only slightly (both raising and lowering < 0.3 kJ mol$^{-1}$) four of the seven uncertainty intervals. The slopes still seem uncomfortably non-parallel but at least we recognize that we are not considering invariant values.
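The combination of the $\Sigma E$ and ST regression constants through eq. 6 can be checked with a short sketch. It assumes that the error combination described in the text means the square root of the sum of the squared component errors, which is the reading that matches the quoted uncertainties; the dictionary layout and function names are illustrative only.

```python
import math

HARTREE_TO_KJ_PER_MOL = 2625.46  # conversion factor quoted in the text

# (value, standard error) pairs in hartree, copied from the regressions above.
SIGMA_E = {
    "n-RH": {"slope": (-39.033465, 0.000043), "intercept": (-1.159545, 0.000219)},
    "n-ROH": {"slope": (-39.033459, 0.000026), "intercept": (-76.004902, 0.000120)},
}
ST = {
    "n-RH": {"slope": (39.025552, 0.000164), "intercept": (1.1433195, 0.000832)},
    "n-ROH": {"slope": (39.025781, 0.000081), "intercept": (75.930875, 0.000377)},
}


def combine(term_a, term_b):
    """Add two (value, error) pairs; errors combine as root sum of squares (assumed)."""
    (value_a, err_a), (value_b, err_b) = term_a, term_b
    return value_a + value_b, math.hypot(err_a, err_b)


def enthalpy_constants(series):
    """Slope and intercept of the enthalpy of formation via eq. 6 (Sigma E + ST)."""
    return {key: combine(SIGMA_E[series][key], ST[series][key])
            for key in ("slope", "intercept")}


for series in ("n-RH", "n-ROH"):
    slope, slope_err = enthalpy_constants(series)["slope"]
    print(series, round(slope, 6), round(slope_err, 6),
          round(slope * HARTREE_TO_KJ_PER_MOL, 2), "kJ/mol per CH2")
```

Converting the slopes to kJ mol$^{-1}$ makes the small difference between the two methylene increments easier to compare with the experimental values.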
We note that the slope from the least squares analysis of the HFEs of straight-chain aldehydes [7cc] ($n_c$ = 4-6) is not identical to that from either the n-alkanes or the n-alkanols, primarily due to the extremely small statistical standard error (~10 la) associated with the profound linear fit of the aldehydes.

$$\mathrm{HFE}(n\text{-RCHO}) = (-39.034740 \cdot n_c) - 74.847980$$

We have come full circle. Until there is a method whereby empirical constants are eliminated from our calculations, or until enough experimental data are accumulated, we can only speculate about the convergence of methylene increments from different homologous series.
4. HOMOLOGOUS SERIES: CYCLOALKANES

Consider now a different but formally homologous series composed solely of the repeating methylene groups: CH2, (CH2)2, (CH2)3, (CH2)4, ... (CH2)$_n$. In the current context, we take species with n > 2 to mean the unsubstituted cycloalkanes and not alkane-1,ω-diyls or -diradicals. It is quite clear that the series cannot commence with the triatomic molecule CH2 since neither singlet nor triplet methylene contains tetracoordinate carbon atoms. Whether or not the alternate beginning entry, CH2CH2, qualifies as a member is more contentious. While "cycloethane" is essentially never used as a synonym for ethylene [18] and certainly lacks tetracoordinate carbon, there is a certain, now largely historic appeal [19] for describing C2H4 and other olefins in terms of bent or banana bonds instead of using σ and π orbitals [20]. Olefinic properties of the third entry, (CH2)3, are quite pronounced [21] and, encouraged by σ/π descriptions of cyclopropane [19b, 22], suggest the possibility of a "continuum" of molecular properties with increasing n beginning with n > 2.

We may consider the numerical difference between the gas phase enthalpy of formation of a given (CH2)$_n$ and that of n strainless CH2 groups (the methylene increment derived from the n-alkanes) as the strain energy, SE, of the cycloalkane (CH2)$_n$. However, unlike the previous series, it is well-recognized that there is no constant increment with increasing n [23]. Nonetheless, the energetics of cycloalkanes are still relatively well-understood and the qualitative features of the variation of their strain energies as a function of ring size are a component of the customary education of contemporary organic chemists. The strain energies of the cycloalkanes C2-C17 are: 93.7, 115.1, 110.8, 26.6, 0, 26.1, 40.4, 52.6, 51.7, 47.2, 17.0, 21.4, 12.4, 7.6, 7.9, 14.1 kJ mol$^{-1}$.
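The strain-energy definition above amounts to a one-line calculation. The following sketch uses only numbers quoted elsewhere in this chapter: the strainless methylene increment of -20.6 kJ mol$^{-1}$ and the enthalpies of formation of cyclopropane (53.3), cyclobutane (28.4, as it appears in the arithmetic for reaction 12) and cyclohexane (-123.4). The function and variable names are illustrative.

```python
METHYLENE_INCREMENT = -20.6  # kJ/mol per strainless CH2 group (rounded value from the text)

# Gas-phase enthalpies of formation (kJ/mol) of (CH2)n, keyed by ring size n.
ENTHALPY_OF_FORMATION = {3: 53.3, 4: 28.4, 6: -123.4}


def strain_energy(n, enthalpy_of_formation):
    """SE = enthalpy of formation of (CH2)n minus that of n strainless CH2 groups."""
    return enthalpy_of_formation - n * METHYLENE_INCREMENT


for ring_size, enthalpy in ENTHALPY_OF_FORMATION.items():
    print(ring_size, round(strain_energy(ring_size, enthalpy), 1))
```

With the rounded increment, cyclohexane comes out a few tenths of a kJ mol$^{-1}$ away from zero rather than exactly at the tabulated 0.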
The near, but fortuitous, equality of the strain energy of cyclopropane and cyclobutane (the archetypical carbocyclic three- and four-membered ring compounds, a.k.a. 3MR and 4MR) has entertained and educated numerous theoretical and physical organic chemists [24]. The occasional complication for at least moderate sized n has been rectified: we recall the earlier, and now corrected, discrepancy for cyclotetradecane that arose because of a faulty measurement of the enthalpy of sublimation of this n = 14 species [25]. We reiterate that had bond energies been strictly additive, strain energies would vanish because the total bond energy for the cycloalkane with a given value of n would precisely equal n(D(C-C)) + 2n(D(C-H)), and the enthalpies of formation of (CH2)$_n$ would vary linearly with the parameter n.

5. HOMOLOGOUS SERIES: SATURATED POLYCYCLIC HYDROCARBONS

Saturated, polycyclic (alicyclic) hydrocarbons are composed of three tetracoordinate building blocks: quaternary >C<, tertiary >CH- and secondary -CH2-. As we address the question of homologies for these species, we must accept that there are idiosyncrasies arising from the varying sizes of the component
rings. This is unavoidable because of the above enunciated idiosyncrasies for the related single ring species, the cycloalkanes. Bond energy additivity is unequivocally violated as the total bond energy does not depend solely on the number of hydrogens and the number of carbons. It is clear that there is a reasonably large range of enthalpies of formation, and hence of total bond energies, for any stoichiometry C$_a$H$_b$ we choose [26]. For example, consider the series cyclohexane {1}, trans-decalin {2}, and trans-syn-trans-perhydroanthracene {3}: (CH2)6(CH)0, (CH2)8(CH)2, (CH2)10(CH)4 with their increasing numbers of secondary and tertiary carbons and 6-membered rings (6MR). The members of this series differ by a constant [(CH)2(CH2)2] structural unit.
{1}
{2}
{3}
Since their respective enthalpies of formation (-123.4 ± 0.8, -182.1 ± 2.3 and -243.2 ± 3.8 kJ mol$^{-1}$) linearly correlate ($r^2$ = 0.9999) with the number of such structural units, it follows that the reaction

$$(\mathrm{CH_2})_6 + (\mathrm{CH_2})_{10}(\mathrm{CH})_4 \rightarrow 2\,(\mathrm{CH_2})_8(\mathrm{CH})_2 \qquad (7)$$

is thermoneutral within the reported error bars. Based on the regression constants (slope = -59.4 ± 0.7 kJ mol$^{-1}$, intercept = -123.4 ± 1.5) the enthalpy of formation for trans-syn-trans-syn-trans-perhydronaphthacene is predicted to be -302.7 ± 1.0 kJ mol$^{-1}$.

Another series of 6-membered ring species is cyclohexane, adamantane {4}, diamantane {5}: (CH)0(CH2)6, (CH)4(CH2)6, (CH)8(CH2)6 with their increasing numbers of tertiary carbons and 6-membered rings. Note that this series is not homologous in the sense we have used previously, in that the [(CH)4] increase between members of the series occurs in a disconnected manner.
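The regression for the cyclohexane, trans-decalin, perhydroanthracene series can be sketched as a weighted straight-line fit. The weighting (inverse squares of the quoted uncertainty intervals) is assumed here by analogy with the treatment of eq. 3; the function name and variable names are hypothetical. The same data also give the enthalpy of reaction 7 directly.

```python
def weighted_line_fit(x, y, sigma):
    """Least-squares line y = intercept + slope * x with weights 1 / sigma**2."""
    w = [1.0 / s ** 2 for s in sigma]
    sw = sum(w)
    swx = sum(wi * xi for wi, xi in zip(w, x))
    swy = sum(wi * yi for wi, yi in zip(w, y))
    swxx = sum(wi * xi * xi for wi, xi in zip(w, x))
    swxy = sum(wi * xi * yi for wi, xi, yi in zip(w, x, y))
    det = sw * swxx - swx ** 2
    slope = (sw * swxy - swx * swy) / det
    intercept = (swxx * swy - swx * swxy) / det
    return slope, intercept


# Number of [(CH)2(CH2)2] units added to cyclohexane, with enthalpies of
# formation and uncertainty intervals (kJ/mol) as quoted in the text.
units = [0, 1, 2]
enthalpies = [-123.4, -182.1, -243.2]
uncertainties = [0.8, 2.3, 3.8]

slope, intercept = weighted_line_fit(units, enthalpies, uncertainties)

# Unweighted fit for comparison (equal weights).
u_slope, u_intercept = weighted_line_fit(units, enthalpies, [1.0, 1.0, 1.0])

# Reaction 7: cyclohexane + perhydroanthracene -> 2 trans-decalin.
reaction_7 = 2 * enthalpies[1] - enthalpies[0] - enthalpies[2]

print(round(slope, 1), round(intercept, 1))
print(round(intercept + 3 * slope, 1), round(u_intercept + 3 * u_slope, 1))
print(round(reaction_7, 1))
```

Under these assumptions the weighted fit returns the quoted slope and intercept to one decimal place, whereas the quoted extrapolation to three structural units matches the unweighted line more closely. That observation is an inference from the arithmetic, not something the text states.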
{4}
{5}
Their enthalpies of formation (-123.4 ± 0.8, -134.6 ± 2.3, -145.9 ± 2.7 kJ mol$^{-1}$) also show a linear dependence on the increasing number of >CH- groups (slope = -2.8 ± 0.1, intercept = -123.4 ± 0.1, $r^2$ = 0.99999). Equivalently, the reaction

$$(\mathrm{CH_2})_6 + (\mathrm{CH_2})_6(\mathrm{CH})_8 \rightarrow 2\,(\mathrm{CH_2})_6(\mathrm{CH})_4 \qquad (8)$$
is thermoneutral within the reported error bars. The above enthalpies of formation allow one to estimate the enthalpies of formation of both -CH2- and >CH- groups in these polycyclic hydrocarbons. Perhaps the simplest is to define the former enthalpy as 1/6 of the enthalpy of formation of cyclohexane. The result, -20.6 kJ mol$^{-1}$, is very similar to the n-alkane methylene increment. Accepting this value and assuming additivity of increments in the first series, the enthalpy of formation for the >CH- group is found to be -9.3 kJ mol$^{-1}$. This value is much more negative (that is, stabilizing) than that derived for the second series, -2.8 kJ mol$^{-1}$. In the latter series, each additional >CH- group results in formation of an axial substituent bond to a cyclohexane ring with its concomitant gauche interactions; in the former series each additional >CH- group results in an equatorial substituent bond with no additional gauche interactions. The difference, 6.5 kJ mol$^{-1}$, represents the gauche interactions and whatever additional strain accompanies the formation of the rigid adamantane and diamantane cages. In comparison, the 12.9 kJ mol$^{-1}$ enthalpy difference between cis- and trans-decalin represents three additional gauche interactions between the rings.

6. TETRAHEDRANE AND [1.1.1]PROPELLANE

Let us now consider tetrahedrane {6} and [1.1.1]propellane {7}. Both species have numerous 3MRs or cyclopropane rings. We thus expect both compounds to be highly strained. Do these hydrocarbons belong to any homologous series? Are they to be understood as avatars for the study of yet more wondrous strained species? Are they, as it were, alicycles resplendent in glorious isolation despite the considerable attention paid to both tetrahedrane [27] and [1.1.1]propellane [28] by computationally oriented theoretical chemists?
{6}
{7}
In what follows, we will estimate the enthalpies of formation of tetrahedrane and of [1.1.1]propellane by assuming, perhaps disingenuously, that these species are "normal" and they belong to various "homologous" series that contain one or the other of these two hydrocarbons. Then, using the enthalpies of formation for the earlier members of these series, we will "predict" the desired gas phase enthalpies of formation. Although we remember the often given admonition that it is easier to interpolate than to extrapolate, we will nonetheless look at several
different series to derive the desired numbers. Agreement between the methods would suggest normalcy for the tetrahedrane and the propellane while significant disagreement would suggest that these species are unique.

6.1 Tetrahedrane
Tetrahedrane is certainly to be found amongst any collection of consummately strained polycyclic hydrocarbons. While the "2-dimensional" cyclopropane is composed of three carbons and a single strained 3-membered ring, the "3-dimensional" tetrahedrane is made of four carbons and a number of strained cyclopropanes as faces. As befits relative availability and seeming stability as well, there is yet only one thermochemically characterized tetrahedrane, the tetra-t-butyl derivative [29], for which both condensed and gas phase enthalpies of formation are available [30].

The first approach to tetrahedrane starts with cyclopropane and bicyclobutane with their enthalpies of formation of 53.3 ± 0.6 and 217.1 ± 0.8 kJ mol$^{-1}$. However, tetrahedrane is not the next compound. Cyclopropane has three carbons, all of which are secondary, and together compose one 3-membered ring (3MR). Bicyclobutane has four carbons, two of which are secondary and two of which are tertiary, all together composing two 3-membered rings. What is the next compound if not tetrahedrane? Consider the individual counts of the number of secondary carbons, tertiary carbons, and 3MRs in cyclopropane and bicyclobutane. We have three and two secondary carbons; linearity says the next species would have but one. We have no and two tertiary carbons; the next species would have four. We have one and two 3MRs; the next species would have three. In contrast, tetrahedrane has four tertiary carbons and no secondary carbons that compose its three 3MRs. We identify the one secondary carbon in the hypothetical [(CH2)(CH)4] as the above "strainless methylene increment" and its enthalpy of formation as -20.6 kJ mol$^{-1}$ and the other four tertiary carbons compose tetrahedrane. If these compounds form a homologous series, then the enthalpy-of-formation difference between adjacent members should be equal, that is

$$\Delta H_f[(\mathrm{CH_2})_2(\mathrm{CH})_2] - \Delta H_f[(\mathrm{CH_2})_3] = \{\Delta H_f[(\mathrm{CH})_4] + (-20.6)\} - \Delta H_f[(\mathrm{CH_2})_2(\mathrm{CH})_2] \qquad (9)$$
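Equation 9 can be rearranged for the tetrahedrane term and evaluated with the two enthalpies of formation and the methylene increment quoted above. The sketch below is only that arithmetic; the function and variable names are illustrative.

```python
CYCLOPROPANE = 53.3          # kJ/mol, (CH2)3
BICYCLOBUTANE = 217.1        # kJ/mol, (CH2)2(CH)2
METHYLENE_INCREMENT = -20.6  # kJ/mol, strainless CH2


def next_member(first, second):
    """Extrapolate one equal step beyond two adjacent members of a series."""
    return second + (second - first)


# Hypothetical [(CH2)(CH)4]: the next member after bicyclobutane.
hypothetical_next = next_member(CYCLOPROPANE, BICYCLOBUTANE)

# Removing the strainless CH2 contribution leaves the (CH)4 estimate.
tetrahedrane_estimate = hypothetical_next - METHYLENE_INCREMENT

print(round(tetrahedrane_estimate, 1))
```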
From this approach, the predicted value of the enthalpy of formation of gaseous tetrahedrane is ca. 402 kJ mol$^{-1}$.

At the risk of ignoring the consequence of any vestige of homoaromaticity, as well as any other interactions among double bonds and/or 3MRs, the second putative homologous series proceeds from tetrahedrane through benzvalene {8}, semibullvalene {9}, and triquinacene {10}; from a count of no, one, two, and three cyclopentenes with their defining double bonds; and from four, six, eight, and ten carbons. In each of the four cases, there is a central tertiary >CH- affixed to three other tertiary >CH- groups. These latter >CH- groups are joined by no, one, two, and three -CH=CH- links and by three, two, one, and no direct >CH-CH< bonds, respectively.
{8}
{9}
{10}
The (non-combustion calorimetrically) experimentally determined enthalpies of formation for benzvalene [31], semibullvalene [32], and triquinacene [33] are 363, 308, and 224 kJ mol$^{-1}$ respectively. Assuming linear dependence of enthalpy of formation on any of these "counts" ($r^2$ = 0.98), we derive a value for the enthalpy of formation of tetrahedrane of 437 kJ mol$^{-1}$. The equivalent nonstatistical derivation is to assume thermoneutrality for the reaction

$$(\mathrm{CH})_8 + (\mathrm{CH})_6 \rightarrow (\mathrm{CH})_{10} + (\mathrm{CH})_4 \qquad (10)$$

i.e., semibullvalene + benzvalene → triquinacene + tetrahedrane, which corresponds to an enthalpy of formation of tetrahedrane of 447 kJ mol$^{-1}$.

Yet another route to tetrahedrane proceeds through prismane {11} and cubane {12}. Homology is documented in carbon counts: tetrahedrane, prismane and cubane have four, six, and eight tertiary carbons. Recalling that the strain energies of 3- and 4-membered rings are nearly the same, homology is also implicitly asserted by the number of these "microrings". Tetrahedrane has three 3MRs, the tetracyclic prismane has three 4MRs and one 3MR (or alternatively, and quite enthalpically indistinguishable, two apiece 4MRs and 3MRs), and the pentacyclic cubane has five 4-membered rings (4MRs): the number of microrings increases linearly from three to four to five.
{11}
{12}
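The two estimates from the benzvalene, semibullvalene, triquinacene series can be reproduced with an ordinary (unweighted) least-squares line and with the thermoneutrality assumption for reaction 10. This is a sketch under the assumption that the fit is unweighted, since no uncertainty intervals are quoted for these three values; the helper name is illustrative.

```python
def line_fit(x, y):
    """Ordinary least-squares line; returns (slope, intercept, r_squared)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    syy = sum((yi - mean_y) ** 2 for yi in y)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x, sxy ** 2 / (sxx * syy)


# Number of cyclopentene rings and enthalpies of formation (kJ/mol) from the text.
cyclopentene_counts = [1, 2, 3]   # benzvalene, semibullvalene, triquinacene
enthalpies = [363.0, 308.0, 224.0]

slope, intercept, r_squared = line_fit(cyclopentene_counts, enthalpies)

# Tetrahedrane has zero cyclopentene rings, so the intercept is the estimate.
regression_estimate = intercept

# Reaction 10 assumed thermoneutral:
# semibullvalene + benzvalene -> triquinacene + tetrahedrane.
benzvalene, semibullvalene, triquinacene = enthalpies
thermoneutral_estimate = semibullvalene + benzvalene - triquinacene

print(round(regression_estimate), round(r_squared, 3), round(thermoneutral_estimate))
```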
The necessary enthalpy of formation of cubane [34] is 622 kJ mol$^{-1}$, and that of prismane [35] is taken as ca. 570 kJ mol$^{-1}$. Linear extrapolation gives the enthalpy of formation of tetrahedrane as ca. 518 kJ mol$^{-1}$. Alternatively, in the absence of measured enthalpy of formation data for prismane, we assert thermoneutrality for the formal reaction

$$(\mathrm{CH})_8 + \mathrm{SE}[(\mathrm{CH_2})_n] \rightarrow 2\,(\mathrm{CH})_4 \qquad (11)$$

(where SE is the strain energy when n is either 3 or 4) and thus derive the enthalpy of formation of tetrahedrane as ca. 370 kJ mol$^{-1}$.

The final approach discussed here for estimating the enthalpy of formation of tetrahedrane is to assume the reaction

$$2\,(\mathrm{CH})_2(\mathrm{CH_2})_2 \rightarrow (\mathrm{CH})_4 + (\mathrm{CH_2})_4 \qquad (12)$$
is thermoneutral. The use of this approach was encouraged, in part, by the name bicyclobutane for the species on the left and tetrahedrane (tricyclobutane) and cyclobutane (monocyclobutane) for the two species on the right. In other words, bond energy additivity is supplemented by equality for the ring count, 2 + 2 = 3 + 1. From these assumptions, an enthalpy of formation for tetrahedrane of 2(217.1) - 28.4 = 406 kJ mol$^{-1}$ is deduced.

Consider the values we have "derived" for tetrahedrane: 402, 437, 447, 520, 370 and 406 kJ mol$^{-1}$. The mean value is 430 ± 51 kJ mol$^{-1}$. The large standard deviation for the six values suggests that the thermochemistry of tetrahedrane is incompatible with simple modeling and linear extrapolation from homologous series. This conclusion is perhaps not a surprise, but it is nonetheless a disappointment. Our values are also much lower than the internally more consonant values of 561, 554 ± 23 and 535 ± 4 kJ mol$^{-1}$ derived from the analysis of high level quantum chemical calculations [36]. We thus suggest, admittedly a posteriori, the presence of superstrain [37], i.e. the presence of strain or additional destabilization beyond that of the component rings. The inclusion of the strain energy for a fourth 3MR, a ca. 115 kJ mol$^{-1}$ correction (the strain energy of cyclopropane), results in an amended value of 545 ± 51 kJ mol$^{-1}$. After all, it is the acute 60° C-C-C angle, of which there are 12 (or four 3MRs) in tetrahedrane, that is the defining feature of cyclopropane, tetrahedrane and other species with 3MRs. Nonetheless, we still seem to estimate on the low side for the enthalpy of formation for tetrahedrane. It would appear that tetrahedrane stands alone: this species may be considered a hydrocarbon hermit.

6.2 [1.1.1]Propellane
[1.1.1]Propellane is exquisitely strained with its three 3-membered rings as well as two carbons having the intriguing structural feature of an "inverted tetrahedron" [38]. However, unlike tetrahedrane, [1.1.1]propellane seemingly lacked the "need" of bulky (or even any) substituents [39] in order to be successfully isolated [40] and thermochemically characterized by condensed and gas phase enthalpies of formation [41]. We will attempt to derive its enthalpy of formation before comparing it with its experimental value. The first conceptual approach assumes that the reaction enthalpy of its formal synthesis from cyclopropane and 2,2,3,3-tetramethylbutane, equation 13,
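is the same as that of the formal synthesis of bicyclobutane from cyclopropane and 2,3-dimethylbutane, equation 14.

$$3\,(\mathrm{CH_2})_3 + (\mathrm{CH_3})_3\mathrm{C{-}C}(\mathrm{CH_3})_3 \rightarrow (\mathrm{C})_2(\mathrm{CH_2})_3 + 3\,\mathrm{CH_3(CH_2)_2CH_3} \qquad (13)$$

$$2\,(\mathrm{CH_2})_3 + (\mathrm{CH_3})_2\mathrm{CH{-}CH}(\mathrm{CH_3})_2 \rightarrow (\mathrm{CH})_2(\mathrm{CH_2})_2 + 2\,\mathrm{CH_3(CH_2)_2CH_3} \qquad (14)$$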
Reaction 14 is endothermic by 37 kJ mol$^{-1}$, and provides additional documentation for the earlier enunciated superstrain [37] for bicyclobutane. Accepting this value as a correction term for equation 13 gives our first estimate of the enthalpy of formation of gaseous [1.1.1]propellane, namely 349 kJ mol$^{-1}$. We may recognize reactions 13 and 14 as part of a homologous series in which the next and last reaction is

$$(\mathrm{CH_2})_3 + \mathrm{CH_3CH_2{-}CH_2CH_3} \rightarrow (\mathrm{CH_2})_2(\mathrm{CH_2})_1 + 1\,\mathrm{CH_3(CH_2)_2CH_3} \qquad (15)$$
This "non-reaction" reaction is precisely thermoneutral. Linear extrapolation from equations 14 and 15 suggests that equation 13 should be endothermic by twice 37 kJ mol$^{-1}$ and so the enthalpy of formation of [1.1.1]propellane would thus be expected to equal 386 kJ mol$^{-1}$.

The next conceptual approach for the enthalpy of formation of [1.1.1]propellane assumes that the enthalpy of the formal propellane synthesis from cyclopropane and neopentane, equation 16, is the same as that of the formal synthesis of bicyclobutane from cyclopropane and isobutane, equation 17.

$$3\,(\mathrm{CH_2})_3 + 2\,\mathrm{C(CH_3)_4} \rightarrow (\mathrm{C})_2(\mathrm{CH_2})_3 + 2\,\mathrm{CH_3CH_2CH_3} + 2\,\mathrm{CH_3(CH_2)_2CH_3} \qquad (16)$$

$$2\,(\mathrm{CH_2})_3 + 2\,\mathrm{CH(CH_3)_3} \rightarrow (\mathrm{CH})_2(\mathrm{CH_2})_2 + 2\,\mathrm{CH_3CH_2CH_3} + 1\,\mathrm{CH_3(CH_2)_2CH_3} \qquad (17)$$
Equation 17 is endothermic by 44 kJ mol$^{-1}$, and is consonant with the earlier 37 kJ mol$^{-1}$ attributed as the superstrain in bicyclobutane. Accepting this value as a correction for equation 16 gives our next estimate for the enthalpy of formation of gaseous [1.1.1]propellane, 326 kJ mol$^{-1}$. We recognize equations 16 and 17 as part of a homologous series involving two and one molecule of n-butane as an additional product. The remaining reaction involves zero molecules of n-butane as a product:

$$(\mathrm{CH_2})_3 + 2\,\mathrm{CH_2(CH_3)_2} \rightarrow (\mathrm{CH_2})_2\mathrm{CH_2} + 2\,\mathrm{CH_3CH_2CH_3} + 0\,\mathrm{CH_3(CH_2)_2CH_3} \qquad (18)$$
This admittedly "nonreaction" reaction is, of course, precisely thermoneutral. Linear extrapolation suggests that reaction 16 should be endothermic by twice 44 kJ mol$^{-1}$ and so the enthalpy of formation of [1.1.1]propellane should be 370 kJ mol$^{-1}$.

Our last conceptual approach assumes the reaction enthalpy of the formal [1.1.1]propellane synthesis from 1,1-dimethylcyclopropane and two cyclopropanes, equation 19, is the same as that of the formal synthesis of bicyclobutane from methylcyclopropane and cyclopropane, equation 20

$$2\,(\mathrm{CH_2})_2\mathrm{C(CH_3)_2} + 1\,(\mathrm{CH_2})_3 \rightarrow (\mathrm{C})_2(\mathrm{CH_2})_3 + 2\,\mathrm{CH_3(CH_2)_2CH_3} \qquad (19)$$

$$2\,(\mathrm{CH_2})_2\mathrm{CHCH_3} + 0\,(\mathrm{CH_2})_3 \rightarrow (\mathrm{CH})_2(\mathrm{CH_2})_2 + 1\,\mathrm{CH_3(CH_2)_2CH_3} \qquad (20)$$
Equation 20 is endothermic by 43 kJ mol$^{-1}$, and by being nearly identical to the earlier 37 and 44 kJ mol$^{-1}$ for other reactions involving bicyclobutane, provides additional documentation of the superstrain of bicyclobutane. Accepting this value for the second reaction gives our penultimate estimate of the enthalpy of formation of gaseous [1.1.1]propellane of 331 kJ mol$^{-1}$. We may recognize equations 19 and 20 as part of a homologous series: the remaining reaction is

$$2\,(\mathrm{CH_2})_2\mathrm{CH_2} + (-1)\,(\mathrm{CH_2})_3 \rightarrow 1\,(\mathrm{CH_2})_2(\mathrm{CH_2})_1 + 0\,\mathrm{CH_3(CH_2)_2CH_3} \qquad (21)$$
This "nonreaction" reaction is precisely thermoneutral. Linear extrapolation suggests that reaction 19 should be endothermic by twice 44 kJ mol$^{-1}$ and so the enthalpy of formation of [1.1.1]propellane should be 375 kJ mol$^{-1}$.

The predicted values for the enthalpy of formation of gaseous [1.1.1]propellane are 349, 386, 326, 370, 331 and 375 kJ mol$^{-1}$, resulting in an average value of 356 ± 25 kJ mol$^{-1}$. These values conceptually "divide" into two categories: 349, 326, 331; and 386, 370, 375, depending on whether the superstrain of bicyclobutane is included once or twice. The measured value of the enthalpy of formation of [1.1.1]propellane [41] is 355 kJ mol$^{-1}$, suggesting that the superstrain be included some 1.5 or 3/2 times. Maybe this last noninteger multiple is sensible: bicyclobutane has two -CH2- groups affixed to the central CC spoke, while [1.1.1]propellane has three such -CH2- groups. However, given the additionally destabilizing feature of two so strongly inverted tetrahedral carbons in [1.1.1]propellane, we would have thought this species to be considerably more strained than it is. Our success suggests [1.1.1]propellane is quite sensible, if not also normal, and is a hydrocarbon harbinger for further understanding of strained species. Yet [1.1.1]propellane has fooled us before [42] and so we are chastened, cautious, and remain curious about this multi-ring hydrocarbon.
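The summary statistics for the six [1.1.1]propellane estimates, and the "3/2 times" reading of the measured value, can be sketched as follows. The sketch assumes the quoted ± value is a sample standard deviation. The interpolation between the two group means is just one way to express how many times the superstrain is effectively included. The increments are the differences between each pair of quoted estimates (37, 44 and 44 kJ mol$^{-1}$), and the variable names are illustrative.

```python
from statistics import mean, stdev

# Estimates with the bicyclobutane superstrain included once (kJ/mol).
once = [349, 326, 331]
# Increment applied in each linear extrapolation (difference of quoted estimates).
increments = [37, 44, 44]
# Estimates with the superstrain included twice.
twice = [h + s for h, s in zip(once, increments)]

all_estimates = once + twice
average = mean(all_estimates)
spread = stdev(all_estimates)  # sample standard deviation (assumed)

MEASURED = 355  # kJ/mol, experimental value quoted in the text [41]

# Linear interpolation between the "once" and "twice" group means.
effective_multiple = 1 + (MEASURED - mean(once)) / (mean(twice) - mean(once))

print(twice)
print(round(average), round(spread), round(effective_multiple, 2))
```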
REFERENCES 1. 2.
3.
M. Pomerantz and J. F. Liebman, Tetrahedron Lett. (1975) 2385. H. Basch and T. Hoz, in The Chemistry of Organic Germanium, Tin and Lead Compounds (ed. S. Patai), Wiley, Chichester, 1995, and references cited therein. Unless otherwise cited, all enthalpies of formation are from the compendium J. B. Pedley, R. D. Naylor and S. P. Kirby, Thermochemical Data of Organic Compounds (2nd Ed.) Chapman & Hall, New York, 1986.
318
.
.
.
~
9.
10. 11.
12.
13.
14.
S. W. Slayden and J. F. Liebman, in Supplement A3: The chemistry of the double-bonded functional groups, (Ed. S. Patai), Wiley, Chichester, 1997. a) R. L. Montgomery and F. D. Rossini, J. Chem. Thermodynamics, 10 (1978) 471; b)J. F. Liebman, J. A. Martinho SimSes, and S. W. Slayden, Structural Chemistry, 6 (1995) 65. E. J. Prosen, W. H. Johnson and F. D. Rossini, J. Res. Natl. Bur. Stand., 37 (1946) 51. J. D. Cox and G. Pilcher, Thermochemistry of Organic and Organometallic Compounds, Academic Press, London and New York, 1970. P. Sellers, G. Stridh, and S. Sunner, J. Chem. Eng. Data, 23 (1978) 250. a) J. F. Liebman and S. W. Slayden, in Molecular Structure Research (Eds. M. Hargittai and I. Hargittai), JAI Press, Greenwich, CT, in press; b) S. W. Slayden, J. F. Liebman, and W. G. Mallard, in The chemistry of functional groups, Supplement D: The chemistry of halides, pseudo-halides and azides Vol. 2, (Eds. S. Patai and Z. Rappoport), Wiley, Chichester, 1995; c) J. F. Liebman, K. S. K. Crawford, and S. W. Slayden, in The chemistry of functional groups, Supplement S: The chemistry of sulphur-containing functional groups, (Eds. S. Patai and Z. Rappoport), Wiley, Chichester, 1993; d) S. W. Slayden and J. F. Liebman, in The chemistry of functional groups, Supplement E: The chemistry of hydroxyl, ether and peroxide groups Vol. 2, (Ed. S. Patai), Wiley, Chichester, 1993. See, for example J. F. Liebman, M. S. Campbell, and S. W. Slayden, in Supplement F2: The chemistry of amino, nitroso, nitro and related compounds, (Ed. S. Patai), Wiley, Chichester, 1996. a) Cox and Pilcher (ref. 7) reviewed the various modern bond energy schemes and assess their strengths, weaknesses, and equivalencies. Some of the schemes have been updated since then. b) See, for example, N. Cohen and S. W. Benson in The Chemistry of alkanes and cycloalkanes, (Eds. S. Patai and Z. Rappoport), Wiley, Chichester, 1992. c) An extension of the group method to include substructures is detailed in ref. 3. 
Enthalpies of formation may be calculated with parameters derived from molecular mechanics force fields. See U. Burkert and N. L. Allinger, Molecular Mechanics, American Chemical Society, Washington, D. C., 1982; N. L. AUinger, Y. H. Yuh, and J-H. Lii, J. Amer. Chem. Soc., 111 (1989) 8551; N. L. Allinger, X-F. Zhou, and J. Bargsma, Theochem-J. Molec. Structure, 118 (1994) 69. a) L. R. Schmitz and Y. R. Chen, J. Comput. Chem., 15 (1994) 1437 and references therein; b) N. L. AUinger, L. R. Schmitz, I. Motoc, C. Bender, and J. K. Labanowski, J. Am. Chem. Soc., 114 (1992) 2880; c) L. R. Schmitz, I. Motoc, C. Bender, J. K. Labanowski, and N. L. Allinger, J. Phys. Org. Chem., 5 (1992) 225; d) N. L. Allinger, L. R. Schmitz, I. Motoc, C. Bender, and J. K. Labanowski, J. Phys. Org. Chem., 3 (1990) 732; e) K. B. Wiberg, J. Org. Chem., 50 (1985) 5285. W. C. Herndon, Chem. Phys. Letts, 234 (1995) 82.
15. In the least squares analyses, the individual ST energies were weighted inversely as the squares of the experimental enthalpy-of-formation uncertainty intervals only. (The uncertainty intervals for the combined ZE are identical for each data point.) The standard errors in the regression equations were generated from the unweighted enthalpies of formation.
16. For comparison, 39.025936 hartree is the bond energy methylene increment found in Ref. 13d. It was derived from the alkanes and then transferred to functionalized alkanes.
17. D. R. Stull, E. F. Westrum, and G. C. Sinke, The Chemical Thermodynamics of Organic Compounds, Krieger, FLA., 1987.
18. See, for example, A. Greenberg and J. F. Liebman, Strained Organic Molecules, Academic Press, New York, 1978, pp. 43-44.
19. a) L. Pauling, J. Am. Chem. Soc., 53 (1931) 1347; b) C. A. Coulson and E. T. Stewart, in The Chemistry of the Alkenes (ed. S. Patai), Wiley, New York, 1964.
20. See, for example, R. P. Messmer and P. A. Schultz, Phys. Rev. Lett., 57 (1986) 2653.
21. See, for example, J. B. Conant, pp. 1-42, and D. Cremer, R. F. Childs and E. Kraka, pp. 339-410, in The Chemistry of the Cyclopropyl Group Vol. 2, (ed. Z. Rappoport), Wiley, Chichester, 1995.
22. D. Cremer, E. Kraka and K. J. Szabo, in The Chemistry of the Cyclopropyl Group Vol. 2, (ed. Z. Rappoport), Wiley, Chichester, 1995.
23. E. L. Eliel and J. J. Engelsman, J. Chem. Educ., 73 (1996) 1203.
24. See, for example, K. Wiberg, in The Chemistry of the Cyclopropyl Group Vol. 1 Part 1, (ed. Z. Rappoport), Wiley, Chichester, 1995 and J. F. Liebman, in The Chemistry of the Cyclopropyl Group Vol. 2, (ed. Z. Rappoport), Wiley, Chichester, 1995.
25. J. S. Chickos, D. G. Hesse, S. Y. Panshin, D. W. Rogers, M. Saunders, P. M. Uffer and J. F. Liebman, J. Org. Chem., 57 (1992) 1897. Also see ref. 23 for a salutary note.
26. Godleski, P. v. R. Schleyer, E. Osawa, and T. Wipke, Progr. Phys. Org. Chem., 13 (1983) 63.
27. M. N. Gloukhovtsev, S. Laiter and A. Pross, J. Phys. Chem., 99 (1995) 6828 and numerous references cited therein.
28. See T. Kar and K. Jug, Chem. Phys. Lett., 256 (1996) 201 and numerous references cited therein.
29. G. Maier, Angew. Chem. Intl. Ed., 27 (1988) 309.
30. Unpublished measurements of the enthalpy of combustion by M. Månsson, and of the enthalpy of sublimation by C. Rüchardt, H.-D. Beckhaus and B. Dogan, cited in ref. 29.
31. N. J. Turro, C. A. Renner, T. J. Katz, K. B. Wiberg and H. A. Connon, Tetrahedron Lett., 46 (1976) 4133.
32. K. Hassenrück, H.-D. Martin and R. Walsh, Chem. Rev., 89 (1989) 1125.
33. J. F. Liebman, L. A. Paquette, J. L. Peterson and D. W. Rogers, J. Am. Chem. Soc., 108 (1986) 8267.
34. We accepted the value recorded in ref. 3 as measured by B. D. Kybett, S. Carroll, P. Natalis, D. W. Bonnell, J. L. Margrave and J. L. Franklin, J. Am. Chem. Soc., 88 (1966) 626, although we acknowledge the alternative, group increment-assisted experimental value of D. R. Kirklin, K. L. Churney and E. S. Domalski, J. Chem. Thermodyn., 21 (1989) 1105, which suggests that the former value is some 40 kJ mol-1 too low.
35. D. W. Rogers, F. J. McLafferty, W. Fang and Y. Qi, Struct. Chem., 4 (1993) 161.
36. Quantum chemical calculations: a) with derived group equivalents: K. B. Wiberg, J. Comput. Chem., 5 (1984) 197; b) with BAC-MP4 corrections: Carl F. Melius, personal communication to the authors; c) at the generally chemically accurate G2 level: Gloukhovtsev, et al., op. cit., ref. 27.
37. J. F. Liebman and A. Greenberg, Chem. Rev., 76 (1976) 311.
38. K. B. Wiberg, G. J. Burgmaier, K. W. Shen, S. J. LaPlaca, W. C. Hamilton and M. D. Newton, J. Am. Chem. Soc., 94 (1972) 7402.
39. P. Kaszynski and J. Michl, in The Chemistry of the Cyclopropyl Group Vol. 2, (ed. Z. Rappoport), Wiley, Chichester, 1995.
40. K. B. Wiberg and F. H. Walker, J. Am. Chem. Soc., 104 (1982) 5239; K. Semmler, G. Szeimies and J. Belzner, J. Am. Chem. Soc., 107 (1985) 6410.
41. K. B. Wiberg, W. P. Dailey, F. H. Walker, S. T. Waddell, L. S. Crocker and M. Newton, J. Am. Chem. Soc., 107 (1985) 7247.
42. Liebman and Greenberg, op. cit., pp. 344-7.
Z.B. Maksić and W.J. Orville-Thomas (Editors)
Pauling's Legacy: Modern Modelling of the Chemical Bond
Theoretical and Computational Chemistry, Vol. 6
© 1999 Elsevier Science B.V. All rights reserved.
Stabilization and Destabilization Energies of Distorted Amides

Arthur Greenberg (a) and David T. Moore (a,b)

(a) Department of Chemistry, University of North Carolina at Charlotte, Charlotte, NC 28223, USA
(b) Present address: Department of Chemistry, University of North Carolina at Chapel Hill, Chapel Hill, North Carolina 27599-6293, USA
1. INTRODUCTION

1.1. Chemical Implications of Strained Amides and Lactams
The amide functional group plays a fundamental role in organic chemistry, biochemistry, medicinal and polymer chemistry [1]. Despite this great wealth of information, only recently has there grown a body of data concerning the structures and energies of distorted amide linkages, and their consequences for reactivity and biological activity [2]. That such distortion introduces unusual reactivity has long been recognized [2], for example, in the enhanced reactivity of beta-lactams (2-azetidinones, 1) [3] and the great difficulty in isolating all but a few alpha-lactams (aziridinones, 2) [2]. In 1938, Lukeš [5] recognized the relevance of Bredt's rule and predicted that small bridgehead bicyclic lactams ("BBLs") should be very reactive. This is consistent with the subsequent lack of success in isolating (unsubstituted) 2-quinuclidone (1-azabicyclo[2.2.2]octan-2-one, 3) [6].
[Structures 1-3: beta-lactam (2-azetidinone, 1), alpha-lactam (aziridinone, 2) and 2-quinuclidone (3)]
During the 1960s Hall attempted to apply this reactivity using highly polymerizable bridgehead lactams [7]. In addition to 1-3, other compounds with distorted amide linkages include sterically hindered species [8], trans-caprylolactam [9], cyclic dipeptides [10], and twistane lactams [11]. The Bredt's rule analogy is very robust and derives from the structural similarities between the planar olefin linkage and the isoelectronic, (ideally) planar amide linkage [2,7,12,13]. The high condensed-phase rotational barrier (ΔG‡) about the C-N bond in amides (ca. 20 kcal/mol; see later discussion pertaining to ΔH‡) [2] is usually attributed to its partial double bond character (4A ↔ 4C), although this interpretation has been challenged and modified [14-17].

[Resonance structures 4A, 4B and 4C of the amide linkage]
1.2. Biological Implications

The first determination of an experimental heat of combustion of a strained amide or lactam, that of methylpenicillin (5), was reported by R.B. Woodward and NBS collaborators in 1949 [18]. It indicated a strain energy in the 4-membered ring some 6 kcal/mol higher than in the model monocyclic beta-lactams (6). This added strain is, in part, responsible for the biological activity of the penicillins. Over twenty years later, Sweet and Dahl used X-ray data to correlate increased lactam distortion in beta-lactam antibiotics with enhanced biological activities [19].
[Structures 5 and 6]
In the realm of protein chemistry, Ramachandran [20] emphasized the need to abandon the assumption of absolute planarity in peptide linkages. It is recognized that binding with an enzyme strains and activates substrates [21]. For peptidases and proteases this involves distortion of the scissile peptide (amide) linkage, and specific distortions may lead to specific stereochemical consequences [22]. Another important biological topic related to peptide distortion is the recent discovery of peptidyl-prolyl cis-trans isomerase ("rotamase"), an enzyme which catalyzes isomerization of certain cis- and trans-peptides [23-27]. It plays a crucial role in autoimmunosuppression. Experimental evidence suggests that the transition state has a highly distorted peptide linkage [23,24]. The class of proteins termed "immunophilins" appears to play a role in protein folding at the peptidyl-proline linkage, and this is considered to be the rate-determining step [25-27]. Molecular modelling studies of the structure of the active site of rotamase, or for design of rotamase inhibitors, must include amide distortion.
1.3. Effects of Distortion on Acid/Base Properties
Severe distortion of an amide linkage leads to important chemical consequences. Hydrolysis of the 2-quinuclidone 7 was found to be 10^5 times faster than in model unstrained amides [28]. In contrast to the exceedingly weak basicities of unstrained amides and lactams, 2-quinuclidones such as 7-10 [28-33] behave as aminoketones and are only slightly weaker bases than simple amines [29-31]. Consistent with this behavior is the observation that 2-quinuclidones such as 10 are protonated on nitrogen (unstrained amides and lactams are O-protonated) and also alkylate on nitrogen to produce, for example, 10-CH3+ [32,33]. This raises a very interesting question: what are the degrees of distortion that mark the boundary between N- and O-protonation of amides? This fundamental question of bonding and energetics may have biological implications. The pH profiles for hydrolysis of 2-quinuclidones have been employed in studies which model proteolysis [34]. There are no published experimental gas-phase proton affinities for distorted amides and lactams.
[Structures 7, 8, 9, 10, 10-CH3+ and 11]

Our fully optimized 6-31G* ab initio molecular orbital calculations support a strong preference for N-protonation of 3 while indicating that N- and O-protonation are competitive in 1-azabicyclo[3.3.1]nonan-2-one (11) [35-37]. More recent results on other systems will be described.

1.4. Defining Distortion of the Amide Linkage
Dunitz and Winkler [38] defined three independent distortion parameters for the amide linkage: χN, the pyramidalization at nitrogen; χC, the pyramidalization at carbon; and τ, the torsion about the C-N bond. Recognizing that χC is usually quite small, they suggested that a plot of distortion energy versus χN and τ (Figure 1) would be valuable for conformational studies [38]. Brown employed a simpler two-parameter distortion model ignoring χC [39]. Summaries of experimental and calculational distortion parameters in distorted lactams have been published [2,13,35,39,41,42]. There are also two interesting papers which employ the Cambridge Crystallographic Database to examine the interrelationships between amide distortion parameters [41] as well as between parameters in all amides [43].
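The three parameters named above can be computed from amide torsion angles. The sketch below uses the relations commonly quoted for the Winkler-Dunitz analysis, τ = (ω1 + ω2)/2, χC = ω1 − ω3 + 180° and χN = ω2 − ω3 + 180° (modulo 360°). These formulas and the torsion labelling are not spelled out in this chapter, so they should be read as assumptions, and the function names are purely illustrative.

```python
def wrap180(angle):
    """Map an angle in degrees onto the interval [-180, 180)."""
    return (angle + 180.0) % 360.0 - 180.0


def winkler_dunitz(omega1, omega2, omega3):
    """Return (tau, chi_C, chi_N) in degrees from three amide torsions.

    Assumed labelling (not given in the chapter text):
      omega1 = C(alpha)-C(O)-N-C(alpha')
      omega2 = O-C(O)-N-R   (R = the other substituent on nitrogen)
      omega3 = O-C(O)-N-C(alpha')
    """
    # Average omega1 and omega2 along the shorter arc, so that
    # +179 and -179 average to 180 rather than to 0.
    tau = wrap180(omega1 + wrap180(omega2 - omega1) / 2.0)
    chi_c = wrap180(omega1 - omega3 + 180.0)
    chi_n = wrap180(omega2 - omega3 + 180.0)
    return tau, chi_c, chi_n


# A planar cis (lactam-like) linkage shows no distortion at all.
planar = winkler_dunitz(0.0, 0.0, 180.0)

# Moving only the second N-substituent out of plane by 40 degrees
# appears as pyramidalization at N plus half that amount of twist.
pyramidal = winkler_dunitz(0.0, 40.0, 180.0)
```

With these definitions a perfectly planar linkage returns zero for all three parameters, which is the reference point of the distortion-energy surface sketched in Figure 1.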
Figure 1. Schematic representation of a plot of the distortion energy of amides as a function of χN and τ.

1.5. Larger Bridgehead Bicyclic Lactams: Are They Hyperstable?

The discussion to this point has focused on smaller species in which nitrogen is constrained to be highly pyramidal and the N-CO bond is highly twisted. Larger bicyclic lactams, exemplified by the 3.3.3 (12) (boldface denotes the bridge containing the carbonyl group) and 4.3.3 molecules (13 and 14), are as yet unknown. We have somewhat arbitrarily termed systems of 3.3.3 and higher (no bridges smaller than 3) "larger" because they allow nitrogen to be planar or very nearly so [44,45]. Indeed, the entire amide linkage in these systems can approach planarity [36]. Since the model amine has already "paid the price" for planarity (i.e. including the ca. 6 kcal/mol N inversion barrier), the resonance energy (RE) calculated according to equation 1 may be higher than for normal acyclic amides or unstrained lactams.

\[ \mathrm{RE}(13) = \Delta H_f^{\circ}(\text{4.3.3 amine}) + \Delta H_f^{\circ}(\text{4.3.3 ketone}) - \Delta H_f^{\circ}(\text{4.3.3 alkane}) - \Delta H_f^{\circ}(\text{4.3.3 lactam}) \qquad (1) \]

These larger bridgehead bicyclic lactams may thus be "hyperstable". This idea is directly related to the large bridgehead bicyclic olefins, which have been termed "hyperstable" since they have reduced strain relative to the corresponding alkane and, thus, manifest reduced enthalpies of hydrogenation [46,47]. For example, the 4.4.3 bridgehead alkene (15) analogous to 16 is calculated to have a (negative) olefin strain (OS) of -16.8 kcal/mol, which can be taken as a measure of its hyperstability [47].

[Structures 12-16]
2. BACKGROUND

2.1. Energetics of Distorted Lactams
The goal of generating a plot of distortion energy versus geometric distortion parameters for amides based upon experimental and/or high-level calculational data (e.g. Figure 1) has yet to be achieved. X-ray structures for bridgehead lactams 17-24 [13,40,48-50] and for some distorted acyclic amides [41] have been published. While there are numerous published X-ray structures for beta-lactams, the 1,3-diadamantyl derivative 25 [51] is the only alpha-lactam for which an X-ray structure has been published.
[Structures 17-25]
Energy data (heats of combustion, formation, hydrolysis) do not exist for any strained amide or lactam except for combustion data for 5 and 6, as noted earlier, and heats of hydrolysis for "high-energy" amides such as 1-acetylimidazole [52] as well as amides and dipeptides susceptible to enzyme-catalyzed hydrolysis. This is explained by the extraordinary difficulty in obtaining accurate (i.e. 99.98%) combustion calorimetry data and the generally slow hydrolyses of unstrained amide linkages, except for those amenable to catalysis by carboxypeptidase [53]. In contrast, highly reactive lactams such as 10 should be amenable to heat of hydrolysis studies. Since heats of hydrolysis are usually two orders of magnitude smaller than heats of combustion, the accuracy required to obtain useful data is about 98-99%. Wiberg has employed this approach to great advantage in assessing the energetics of various carbonyl compounds [54,55]. Although unstrained amides and lactams hydrolyze too slowly for useful thermochemical measurements, heats of hydrolysis have been indirectly determined by Wadsö for such compounds, e.g. n-butylacetamide, by measuring the enthalpy of aminolysis of acetic anhydride using n-butylamine [56]. Direct heats of hydrolysis were, however, determined for more activated amides such as diacetylamide under conditions allowing for direct comparison with unstrained, simple amides [57].
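The accuracy argument above is simple arithmetic and can be sketched as follows. The assumption made here, which the text implies but does not state, is that the tolerable absolute error in kcal/mol is the same for both kinds of measurement; the function name is illustrative only.

```python
def required_accuracy(reference_accuracy, size_ratio):
    """Fractional accuracy needed for a quantity `size_ratio` times smaller
    than the reference quantity, for the same absolute error."""
    relative_error_reference = 1.0 - reference_accuracy
    # The same absolute error is a size_ratio-times larger fraction
    # of the smaller quantity.
    relative_error_small = relative_error_reference * size_ratio
    return 1.0 - relative_error_small


# Combustion calorimetry at 99.98% accuracy, with heats of hydrolysis
# roughly two orders of magnitude smaller than heats of combustion.
hydrolysis_accuracy = required_accuracy(0.9998, 100)
```

Under that assumption a 0.02% error in a heat of combustion corresponds to a 2% error in a heat of hydrolysis, i.e. the lower end of the 98-99% range quoted above.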
2.2. Bonding in Lactams: Is There Still A Role for Resonance?

Resonance theory [1] has been successful in explaining a wide variety of physical, chemical and spectroscopic properties of amides and lactams [2,41,58]. The planarity of the amide linkage, its high rotational barrier, low basicity, tendency for O-protonation, the short C-N bond length, stability toward hydrolysis, low IR carbonyl frequency, 15N and 13C chemical shifts [58], and UV absorption properties have all been traditionally explained using resonance theory. A different model has been advanced by researchers [14-17,59] who proposed that the C-N sigma bond in the planar structure is more polar covalent (stronger and shorter) than the C-N bond in the transition state and is, thus, the origin of the stabilization and high rotational barrier characteristic of amides. These researchers correctly predicted that there is little variation in C=O bond length as a function of C-N twisting, apparently at odds with the resonance concept. X-ray studies confirm this prediction, but the changes, small as they are, follow the trend predicted by resonance theory [35]. Perrin [60] has criticized the atoms-in-molecules approach of these researchers, particularly for highly electronegative atoms.

Our ab initio molecular orbital calculations (6-31G*//6-31G*) predict Cs symmetry for 2-quinuclidone (3) [35,36]. Furthermore, the calculated structures [35,36] for 2-quinuclidone (3) and 1-azabicyclo[3.3.1]nonan-2-one (11) are consistent with the classical resonance model. The N-CO bond in 3, which lacks amide resonance, is calculated to be the longest such bond reported while the C=O bond is short. In contrast, for the less strained 11, the N-CO bond is slightly longer than for unstrained amides and the C=O bond is somewhat longer than in 3. The variation of C=O bond length between lactams differing significantly in distortion is rather small [14]. These details will be described later in this chapter.

In collaboration with T. Darrah Thomas, Oregon State University, we employed gas-phase electron spectroscopy for chemical analysis (ESCA) to assess the nature of charge on N and O in distorted lactams [61a]. An earlier study of N-ammonioimidates (solid state) found an excellent correlation between N1s shifts and the carbonyl frequencies as predicted by resonance theory [62,63]. For 1-azabicyclo[3.3.1]nonan-2-one (11), a low value for the ionization energy of N1s compared to the planar 1-n-butylpyrrolidone (26) supports reduced positive charge on the nitrogen in 11, consistent with reduced resonance in this twisted lactam. While the differences are small, they are outside the experimental error (0.05 eV). Similarly, the O1s ionization energy in 11 is higher than in 26 due to reduced resonance in the former. The N1s core energy in 1,3-di-tert-butylaziridinone (27) is lower and the O1s core energy in 27 is higher than the corresponding model values [61]. 1,3-Di-tert-butylaziridinone (27) is calculated to have a pyramidal nitrogen (as does the diadamantyl derivative [51]) and this is consistent with its UV photoelectron spectrum [64,65]. Our attempts at obtaining gas-phase ESCA data for the two 2-quinuclidone derivatives 7 and 10 were unsuccessful, possibly due to thermal decomposition upon heating the sample to improve volatility [61]. (It should be noted that 6,6,7,7-tetramethyl-2-quinuclidone (10) is fairly unreactive with nucleophiles due to steric hindrance [66].) More recently, we [61b] have obtained results supporting these studies by examining the core potentials of the planar ground states and rotational transition states of formamide and dimethylacetamide. The simplest explanation of the data is loss of the contribution of 4C with concomitant increased contributions from 4A and 4B [61b].

[Structures 26 (1-n-butylpyrrolidone) and 27 (1,3-di-tert-butylaziridinone)]

2.3. Proton Affinities of Bridgehead Bicyclic Lactams: N vs O
Unstrained amides and lactams protonate and alkylate on oxygen. This is often explained through reference to resonance structures like 4C, although the real point is the greatly enhanced resonance in O-protonated (alkylated) amides relative to the neutral amide [71]. In contrast, 2-quinuclidones such as 10 protonate and alkylate on nitrogen since they are really ketoamines [32,33]. The ab initio calculational results summarized in Table 1 predict that the 2.2.2 lactam (a trans-cyclohexene analogue), and the 3.3.2 (32), 3.2.2 (30) and 3.2.2 (31) systems (all trans-cycloheptene analogues) protonate exclusively at nitrogen. One can see in this table the decrease in proton affinity at nitrogen as this atom approaches planarity, and the increase in proton affinity at oxygen as the N-CO twist decreases, allowing enhanced pi overlap across this linkage. The 3.3.1 system (11), a trans-cyclooctene analogue, favors protonation at N by only 1.9 kcal/mol. However, the 3.3.2 system (33), also a trans-cyclooctene but connected at the 1- and 5-positions by a two-carbon bridge rather than a one-carbon bridge, favors protonation at O. The two-carbon bridge "cuts a bit more slack" to the trans-cyclooctene-like ring. Interestingly, this prediction receives some support from experimental observations by Werstiuk et al. [37] of the dichotomy outlined in Scheme 1, although it is not yet clear whether the very slow reaction is under thermodynamic or kinetic control [37].

[Scheme 1: methylation of 22 and of 24 with CH3OTf (reaction time: days)]
Table 1. Calculated 6-31G*//6-31G* proton affinities (kcal/mol) for N and O on bridgehead bicyclic lactams, employing corrected zero-point vibrational energies and thermal factors (298 K), along with the difference (in kcal/mol) favoring N-protonation. (See Reference 36 for further details.)

Lactam        N-Proton Affinity   O-Proton Affinity   Difference Favoring N
2.2.2 (3)          228.9               206.2                +22.8
3.2.2 (30)         224.7               213.6                +11.1
3.2.2 (31)         223.6               214.4                 +9.1
3.3.2 (32)         224.7               215.1                 +9.6
3.3.1 (11)         219.0               217.6                 +1.4
3.3.2 (33)         214.1               221.2                 -7.1
3.3.3 (12)         218.1               221.6                 -3.5

2.4. State of Calculational Studies of Distorted Amide Linkages

With the exception of a significant body of calculational data on beta-lactams [67], until recently almost no calculational work had been done on distorted lactams [2]. Molecular mechanics (MM) studies which treat distorted amides are based upon spectroscopic parameters which are only suitable for small distortions. MM studies of significantly distorted lactams are, thus, restricted by the small body of structural data referred to earlier and by the complete absence of thermochemical data for suitable benchmark compounds. Nevertheless, molecular mechanics calculations have been applied to selected bridgehead lactams, particularly in the [n.3.1] series [13,36,40,48], with reasonable results for these relatively mildly strained molecules. Recently, a study appeared in which ab initio data were employed to parameterize molecular mechanics for amide distortion [68]. More remains to be done in this area.

There remains, of course, the question as to what level of theory is required. MNDO calculations on 1,3-di-tert-butylaziridinone (27) and 1-azabicyclo[3.3.1]nonan-2-one (11) [64,65], the series 18-20 [13,40], and 22 and 24 [37] indicate N-CO bond lengths that are much too long [13,35]. We have employed ab initio MO calculations to study amides. It is known that small basis sets are not appropriate since they do not properly treat N-inversion barriers [69]. We find that the 6-31G* basis set provides very good agreement with experimental structural and energetics data, and have applied it successfully for full optimization of the bridgehead lactams [35,36] and their protonated forms, as well as to reproduce the structural properties [70,71] of alpha-lactams [51,70-73]. Of course, although synthetic studies cited in this chapter and related synthetic work on bridgehead lactams have been published [74-79], much work has yet to be done, including approaches to the larger "hyperstable" systems.
3. COMPUTATIONAL STUDIES

3.1. Molecular Mechanics

Our MM2 computations were performed on a Silicon Graphics Indigo2 XZ platform using the conformational analysis package from version 3.1 of the SPARTAN computational chemistry software [80a]. The molecular mechanics computations were undertaken with two primary goals in mind. First, we wanted to carry out a low-level conformer search to find the global minimum, or lowest energy conformer (LEC), before proceeding to higher levels of calculation. Second, we wanted to evaluate the performance of MM2 when applied to bicyclic bridgehead lactams. The smaller members of our series have quite unusual geometries and we were curious about how well MM2 was parameterized for such structures. We also started with three larger BBLs (16, 28, 29) in our series that were later discarded to conserve computational time.
The second question was answered almost immediately. When a starting structure close to the experimental geometry for the 2.2.2 (3) BBL was submitted to the MM2 program, a message was returned stating that certain parameters were missing and the program terminated. Upon investigation, it was discovered that SPARTAN's version of MM2 had no parameters for a pyramidal nitrogen atom attached to a planar (sp2) carbon atom. The most obvious and certainly the most rigorous way to solve this problem was to derive new parameters from literature data or high-level ab initio computations. Unfortunately, neither alternative was viable since literature data are scarce and the high-level ab initio calculations would be dependent on MM2 or semi-empirical results for reasonable starting structures. In lieu of rigorous treatment, two "quick-fix" methods were employed to get some results. The first involved systematic parameter duplication. Each missing parameter for sp3 nitrogen/sp2 carbon was supplied by copying the corresponding parameter for sp2 nitrogen/sp2 carbon. The intended effect was to force MM2 to treat distorted amide bonds as if they were undistorted. While this was crude in the extreme, at least the computations would proceed. The second approach was to use starting structures with a planar amide linkage. This was done by starting with a planar linkage and building the rest of the bicyclic cage around it, checking only that connectivities were correct. For the smaller lactams, this often resulted in comically distorted starting structures, but the MM2 program accepted and optimized them. Structures were submitted using either or both of the above methods, and no significant variations were detected. For both large and small lactams the energies were within ±0.002 kcal/mol, bond lengths were within ±0.001 Å, and bond angles and torsional angles were within ±0.05°. These errors are certainly within experimental resolution, and are attributed to rounding or precision errors within the computation.
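The first "quick-fix" amounts to a fallback lookup in the force-field parameter table. The following is only a schematic sketch of that idea: the dictionary layout, the atom-type labels and the numerical values are hypothetical placeholders and do not reproduce SPARTAN's MM2 parameter file or its actual format.

```python
# Hypothetical parameter table keyed by a pair of atom types.
# The numbers are placeholders for illustration, NOT MM2 parameters.
BOND_PARAMS = {
    ("N_sp2", "C_sp2"): {"r0": 1.0, "k": 1.0},
}

# Missing atom type mapped to the undistorted type whose parameters are copied.
FALLBACK_TYPE = {"N_sp3": "N_sp2"}


def lookup_bond_params(type_a, type_b, table=BOND_PARAMS, fallback=FALLBACK_TYPE):
    """Return parameters for a bond, duplicating the undistorted entry
    when the distorted (pyramidal nitrogen) entry is missing."""
    key = (type_a, type_b)
    if key in table:
        return table[key]
    substituted = (fallback.get(type_a, type_a), fallback.get(type_b, type_b))
    if substituted in table:
        return table[substituted]
    raise KeyError("no parameters for %s-%s" % key)


# A pyramidal N attached to a planar carbon is treated exactly like
# the planar-N case, which is the "crude" assumption described above.
patched = lookup_bond_params("N_sp3", "C_sp2")
```

The design consequence is the one noted in the text: every distorted amide bond is silently described by undistorted parameters, so any error this introduces grows with the degree of distortion.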
All further molecular mechanics calculations were carried out using both of these crude "patches."

Now that structures could be optimized, some questions arose concerning the conformer searches. In version 3.1 of the SPARTAN software package, the conformer searching algorithm employed the method of Osawa for rings. While it is quite clear from theory how the search should proceed when applied to a monocyclic system, the implementation for bicyclic systems is unclear. Once a starting bond has been selected, how does the algorithm choose which atoms to move? Simple tests on bicyclo[4.4.3]tridecane showed that the conformers found depended on which bridge contained the selected bond and, furthermore, that still other conformers arose when more than one bridge was selected. Several distinct sets of conformers were therefore possible for each bicyclic species in our study: one set from each single-bridge search, one set from each combination of two bridges, and one set from searching all three bridges at the same time.
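The number of distinct searches described above is just the number of non-empty subsets of the three bridges. A minimal sketch of that count follows; the bridge labels are hypothetical names chosen for illustration.

```python
from itertools import combinations


def bridge_search_sets(bridges):
    """List every non-empty combination of bridges to include in a search."""
    return [
        combo
        for size in range(1, len(bridges) + 1)
        for combo in combinations(bridges, size)
    ]


# Hypothetical labels for the three bridges of a bicyclic cage.
searches = bridge_search_sets(("bridge_a", "bridge_b", "bridge_c"))
number_of_searches = len(searches)
```

For three bridges this gives three single-bridge searches, three two-bridge searches and one three-bridge search.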
Further tests were carried out in an attempt to discover whether there was any way to predict which bridge would give the lowest-energy conformer every time. The results were inconclusive. Thus, seven sets of conformers were produced for each structure. We had chosen to use the "Carbonyl Substitution Nitrogen Atom Replacement" (COSNAR) [36] (see Scheme 2) isodesmic model, so lowest energy conformers were needed not only for the series of lactams, but for each model ketone, amine and alkane as well. In all, 40 starting structures were used, and 7 sets of conformers were generated from each starting structure. Fortunately, the computational time required for each search was very small: the longest calculation took only one half-hour and most were considerably shorter. Still, well over 2000 total conformers were found and their energies and structures cataloged. The positive outcome of all of this bookkeeping was that we could be fairly confident that we had found the lowest energy conformer in every case. At the very least, we were certain that we could not reasonably have been any more thorough or rigorous in our approach. Table 2 lists these lowest energy conformations for each COSNAR model compound used.

Resonance energies were computed using the COSNAR model (Scheme 2). Recall that the ΔHf° of a molecule may be computed from the sum of the Benson [81] increments and the strain energy. Thus the COSNAR resonance energy may be viewed as having two parts (Scheme 2): one arising from the difference in Benson increments and the other arising from the difference in strain energies.

Scheme 2

RE = Σ* GI + Σ* strain

Σ* indicates that the given quantities for each compound are summed with our isodesmic sign convention, for example:

COSNAR APPROACH: Σ* GI = Σ GI(amide) - Σ GI(amine) - Σ GI(ketone) + Σ GI(alkane) = 17.33 kcal/mol
(NOTE: This value is 18.2 kcal/mol under different conditions [36])

Because the COSNAR approach changes only those atoms in the amide linkage, it is easy to see that the Benson resonance term is constant for all bicyclic lactams at 17.33 kcal/mol. The lowest-energy conformation strain energies were used to compute the COSNAR strain term, which was added to the Benson value to give a resonance energy value. The results are given in Table 3, and it is clear that they are too high, especially for the smaller BBLs. The ΔHf°(g) values for each lactam, as computed from the Σ GI's and the corrected MM2 strain energies, are also listed in Table 3.
Table 2. LEC Strain Energies for BBLs and COSNAR Model Compounds Computed with SPARTAN MM2 module (all energies in kcal/mol)

System         Amide    Alkane   Ketone   Amine
2.2.2 (3)      31.29    19.59    20.38    21.77
3.2.2 (31)     27.46    24.27    24.70    27.15
3.2.2 (30)     26.13             23.34
3.3.1 (11)     17.24    18.28    19.55    20.68
3.3.1          30.03             22.92
3.3.2 (33)     24.34    29.99    28.83    32.61
3.3.2 (32)     29.86             28.60
3.3.3 (12)     29.94    37.35    36.13    38.86
4.3.3 (13)     32.71    48.71    46.18    46.27
4.3.3 (14)     31.50             43.99
4.4.3 (16)     36.02    54.90    50.64    47.26
4.4.3 (28)     39.80             52.23
4.4.4 (29)     37.79    56.55    50.41    49.04

(A blank alkane or amine entry means that the value given for the same ring system in the row above applies.)
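The COSNAR bookkeeping of Scheme 2 can be checked directly against the tabulated strain energies. The sketch below is an illustration under one stated assumption: Table 3 reports signed values in which negative numbers denote stabilization, so the 17.33 kcal/mol Benson term enters with a negative sign. That sign handling is our reading of the tabulated numbers, not a statement made in the text, and the function names are illustrative.

```python
# Benson increment term, signed so that negative means stabilizing
# (assumption inferred from the signed values listed in Table 3).
BENSON_TERM = -17.33  # kcal/mol


def cosnar_strain_term(amide, amine, ketone, alkane):
    """Isodesmic sum of MM2 strain energies (kcal/mol)."""
    return amide - amine - ketone + alkane


def raw_resonance_energy(amide, amine, ketone, alkane, benson_term=BENSON_TERM):
    """Raw COSNAR resonance energy: strain term plus Benson term."""
    return cosnar_strain_term(amide, amine, ketone, alkane) + benson_term


# MM2 strain energies (amide, amine, ketone, alkane) taken from Table 2.
TABLE2 = {
    "2.2.2 (3)": (31.29, 21.77, 20.38, 19.59),
    "4.3.3 (13)": (32.71, 46.27, 46.18, 48.71),
}

strain_222 = cosnar_strain_term(*TABLE2["2.2.2 (3)"])
raw_222 = raw_resonance_energy(*TABLE2["2.2.2 (3)"])
strain_433 = cosnar_strain_term(*TABLE2["4.3.3 (13)"])
raw_433 = raw_resonance_energy(*TABLE2["4.3.3 (13)"])
```

For these two rows the strain terms and raw resonance energies obtained this way agree with the corresponding entries of Table 3 to within rounding.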
In the 2.2.2 case, the x-system of the caJbonyl group is almost completely peqxadicular to the nitrogm lone-pair. This suggests that the resomtr, stabilization slmuld be very close to zero. However, the raw COSNAR resonance energy from M 2 is 8.60 kcaVmol. It is likely that this error arises because the moletxfle is simply too distorted for the MM2 parameter set to handle. Recall that we forced the MM2 algorithm to use undistorted N-CO ~ no matter how distorted the actual lactam was. Therefore, we assumed a resonance of zero in the 2.2.2. case, and applied the same linear correction (-8.60 kcal/mol) to all of the other resonance energies; these are the corrected resonance energies in Table 3. It should be noted that this is probably an overcorrection for the larger BBLs since they are more like the acyclic amides from which the pmmneters were developed. Even so, the 4.3.3 (13) and 4.3.3 (14) lactams are sligl~ hyperstable with respect to the COSNAR reference resonance energy of 17.33 kcaVmol. This mccmraged us to proceed on to higher levels ofcongmtatiorL It is also interesting to note that the 4.3.3 BBLs appear to represent a resonance energy maximum. This last point is examined in more det~ later. Some structural data for the BBLs are listed in Table 4, along with the corrected COSNAR resonance energies. These point to some obvious problems with the MM2 stmoa~es. First of all, the N-CO bond lengths show no sensitivity to increased resomnc~ (the carbonyi bond length is also invarim~ but this is ~ e d form previous experimental and computational studies [ 14,15,35,36]). This is not too s u r p ~ however, since variations in resoruur, e involve electronic effects and MM2 is based only on models from classical physics. Also, the N-CO bonds are too short for the smaller laotams. For example, the N-CO bond is calculated by ab initio (6-31G*) methods at 1.433 A in the 2.2.2 (3) BBL. This error is also not unexpected since the undistorted parameters for this bond
332 Table 3 Raw and Corrected M 2 COSNAR resonance energies for BBL series (all energies in kcal/mol) System LEC E_*.strain RE(raw)" RE(corr.) b A I-I~g~orr. r 2.2.2 (3) 8.73 -8.60 0.0 -27.1 3.2.2 (31) -0.11 -17.45 -8.83 -35.8 3.2.2 (30) 0.10 -17.23 -8.63 -37.2 3.3.1 (11) -4.71 -22.04 -13.71 -46.1 3.3.1 4.71 -12.63 -4.02 -33.7 3.3.2 (33) -7.12 -24.45 -15.85 -43.9 3.3.2 0 2 ) -1.36 -18.69 -10.08 -38.3 3.3.3 (12) -7.69 -25.02 -16.42 -43.2 4.3.3 (13) -11.03 -28.36 -19.76 -45.3 4.3.3 (14) -10.05 -27.38 -18.78 -46.5 4.4.3 (16) -6.99 -24.32 -15.72 -46.9 4.4.3 (28) -4.79 -22.12 -13.52 -43.1 4.4.4 (29) -5.11 -22.44 -13.84 -50.0 a) added Benson resonance energy term (17.33 kcal/mol) b) assumed xero resonance in 2.2.2 lactam - subtracted 8.60 k ~ m o l from raw RE's c) includes same 8.60 kcal/mol correction to strain energy used for RE's were used in all cases. Another problem is that the pyramidalization of the cad~nyl carbon appears to be quite high in all of the lagtams. Dunitz and note that this parameter is nearly always <5 ~ [38]. This may again be attn~otged to the nature of the model. Moleodar mechanics methods do not include the electronic effects that operate to keep this distortion small. There is a defa~e inverse conelafion between the resonance energy and the total mnoum of distortion in the BBLs. We provide a crude measure of the total distortion in Table 4, where we have listed the sum of the three angle distortion parmneters. Perhaps some sort of weighted average would be more appropriate, but the given values do show a nice correspondence with the computed resonance stabiliTafi'on, the smaller the distortion parameters. Furthermore, when a given parameter does not follow this general wend, there is compensation in one or both of the other two parameters. For example, the 3.3.1 (11) BBL has 4.88 kcal/mol greater resonance stabilization than the 3.2.2 BBL, but the nitrogen p y r a m i ~ o n is nearly the same. 
However, the CO pyramidalization and the twist angle are smaller for 11, and thus total distortion is less. It was mentioned above that the 4.3.3 lactams appear to represent a resonance maximum for the BBL series. It might be better to think of them as a distortion minimum. Certainly, their Winkler and Dunitz parameters are the smallest of all the BBLs in the series. While one might logically predict that making the bridges larger would allow for more freedom of motion and consequently permit the total distortion to go to zero, this is contrary to the observed trend. What seems to be happening is that the bridgehead atoms are trying to fill the inside of the bicyclic cage. This can be seen in the inward puckering of the 4.4.3 (16) and 4.4.3 (28) BBL and amine nitrogens, and from the fact that the lowest-energy conformation of the 4.4.4 (29) BBL places the bridgehead H-atom inside the cage. In addition, many of the other low-energy conformers for the 4.4.3 and 4.4.4 model compounds show this endo-type geometry either at nitrogen or the bridgehead carbon. None of the 4.3.3 conformers exhibit this behavior. Perhaps the 4.3.3 structures represent a balance between competing steric effects.

Table 4
MM2 Structural Data for BBL Lowest Energy Conformers (bond lengths are in Å and distortion parameters are in °)

System         RE (corr)   rCN     rCO     ΣCNC      χN     χCO    τ      total(a)
trans-cyclohexene analogs
2.2.2 (3)       0.0        1.388   1.207   334.6     54.7    3.0   83.2   140.9
3.3.1           4.02       1.382   1.207   342.3     45.2    3.1   87.2   135.5
trans-cycloheptene analogs
3.2.2 (31)      8.83       1.390   1.208   347.9     39.2   21.1   38.4    98.7
3.2.2 (30)      8.63       1.387   1.208   349.6     35.0   20.2   42.4    97.6
3.3.2 (32)     10.08       1.383   1.210   355.6     23.0   22.8   44.7    70.5
trans-cyclooctene analogs
3.3.1 (11)     13.71       1.388   1.209   347.5     39.3   12.1   23.7    75.1
3.3.2 (33)     15.85       1.391   1.210   356.3     21.5   12.6   22.4    56.5
3.3.3 (12)     16.42       1.386   1.210   359.5      7.8   20.3   39.1    67.2
trans-cyclononene analogs
4.3.3 (13)     19.76       1.390   1.210   360.0      0.9    8.0   19.1    28.0
4.3.3 (14)     18.78       1.390   1.210   350.0      0.8   10.9   22.0    33.7
4.4.3 (28)     13.52       1.389   1.210   359.3(b)   9.3    7.6   20.5    37.4
trans-cyclodecene analogs
4.4.3 (16)     15.72       1.391   1.211   359.2(b)   9.9    9.0   19.2    38.1
4.4.4(c) (29)  13.84       1.398   1.211   359.9      4.3   11.8   24.3    40.4

a. Sum of χN, χCO and τ.
b. Nitrogen is puckered in toward the other bridgehead.
c. Bridgehead C-H is endo (inside the bicyclic cage) in the lowest-energy conformer of the lactam.
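The total-distortion column of Table 4 is simply the sum of the three Winkler-Dunitz distortion parameters (footnote a). As a minimal sketch, assuming the two rows copied below from Table 4 and a hypothetical helper name `total_distortion`, the sum can be checked as follows:

```python
# Distortion parameters in degrees, copied from Table 4 (MM2 conformers):
# (chi_N, chi_CO, tau). The dictionary and function names are illustrative only.
mm2_distortion = {
    "2.2.2 (3)": (54.7, 3.0, 83.2),
    "4.3.3 (13)": (0.9, 8.0, 19.1),
}


def total_distortion(chi_n, chi_co, tau):
    """Return the sum of the three distortion parameters (Table 4, footnote a)."""
    return chi_n + chi_co + tau


for system, params in mm2_distortion.items():
    print(f"{system}: total distortion = {total_distortion(*params):.1f} deg")
```

The same sum applied to the other rows gives a quick consistency check on the tabulated totals.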
The BBLs in Table 4 are arranged according to the largest trans-cycloalkene type ring that can be formed in the bicyclic structure. It is well known that lowest-energy conformers of bridgehead olefins always have the double bond in the largest trans-ring [2,12]. For example, in bicyclo[3.2.2]non-7-ene, the double bond could be in either a six- or seven-membered ring; in the lowest-energy conformation the double bond is found in the seven-membered ring. Since the N-CO bond carries some double bond character (at least according to the Pauling resonance theory of amides), it is reasonable that the BBL lowest-energy conformations also have the partial double bond in the largest trans-ring. This behavior is observed for the MM2 lowest-energy conformations, and in addition, the resonance energies correlate fairly well when the BBLs are grouped in this fashion, as do the distortion parameters and the total distortion values (see Table 4).

To summarize, the MM2 lowest-energy BBL conformers exhibit some trends that are consistent with our prediction of increasing resonance stabilization as a function of size in the BBL series. As bridge size increases, the lactams show a decrease in total distortion and an increase in the isodesmic COSNAR resonance energy. One interesting point that stands out is the apparent resonance maximum at the 4.3.3 cage geometry. Some of the structural trends are inconsistent with the anticipated trends, for example the invariance of the N-CO bond length to increasing resonance and the high carbonyl pyramidalization values. A more thorough comparison of the MM2 results with semi-empirical and ab initio data is presented further on.

3.2. Ab Initio Calculations

We will devote the least discussion here to the results of our ab initio molecular orbital calculations since these have been extensively discussed in two papers [35,36]. While we again note that very extensive basis sets are required to properly treat N-inversion barriers [69], the 6-31G* basis set in the GAUSSIAN program series [82] does appear to reproduce structural data reasonably well [35,36]. The large systems treated here (up to 13 second-row atoms fully optimized at 6-31G*) were optimized on a Cray YMP-C916 supercomputer thanks to the hospitality of Cray Research, Inc. toward David T. Moore.

Table 5 lists total energies and key structural parameters for nine bridgehead lactams as well as some model compounds. The striking features here include: a) the great variation in the OC-N bond length, which ranges from 1.356 Å in 1-methylpyrrolidinone (34) and 1.363 Å in N,N-dimethylacetamide (38) to 1.433 Å in 2-quinuclidone (3), b) the very slight variation in the carbonyl (C=O) bond length, from 1.198 Å in 1-methylpyrrolidinone and 1.202 Å in N,N-dimethylacetamide to 1.183 Å in 2-quinuclidone, c) the variation in twist from nearly 0° in normal amides and lactams to 90° in 2-quinuclidone, and d) the variation in pyramidalization from virtually 0° in the 4.3.3 system (13) to ca 56° in 2-quinuclidone.
[Structures of compounds 30-38]
Table 5
Optimized (6-31G*) ET (without and with Zero-Point Energy and Thermal Corrections) and Selected Geometric Parameters [28] for Amides and Lactams (the Carbonyl-Containing Bridge Is Specified: e.g. 3.3.2 Signifies 1-Azabicyclo[3.3.2]decan-9-one). (See reference 36.)

lactam           -ET (au)     -ET corr (au)   rCO (Å)   rCO-N (Å)   χN (deg)   χCO (deg)   τ (deg)
trans-Cyclohexene Analogue
2.2.2 (3)        400.78202    400.58594       1.183     1.433       55.6        0.0        90.0
trans-Cycloheptene Analogues
3.2.2 (31)       439.82191    439.59319       1.193     1.400       46.7        8.9        44.0
3.2.2 (30)       439.82106    439.59199       1.193     1.402       51.2        9.7        39.2
3.3.2 (32)       478.84770    478.58666       1.193     1.397       36.7       10.4        47.8
trans-Cyclooctene Analogues
3.3.1 (11)       439.83632    439.60731       1.196     1.386       49.9        6.0        21.9
3.3.2 (33)       478.86017    478.59850       1.200     1.374       32.6        7.1        20.0
3.3.3 (12)       517.88266    517.58912       1.200     1.372       18.9       12.0        35.6
trans-Cyclononene Analogues
4.3.3 (13)       556.91145    556.58542       1.205     1.359        0.3        7.5        19.6
4.3.3 (14)       556.91758    556.59167       1.205     1.362        5.0        8.1        20.5
Model Compounds
1-MePyr (34)     323.91275    323.75649       1.198     1.356
Azet (35)        245.81043    245.71880       1.186     1.357
pyr Azir (36)    206.72612    206.66778       1.178     1.348
pl Azir (37)     206.71929    206.66212       1.183     1.302
N,N-DMA (38)     286.03017    285.88252       1.202     1.363

Angle entries for the model compounds, in the order extracted (row assignment not recoverable from the source): 12.8, 1.0, 1.4, 16.9, 1.2, 3.1.
We have applied the atomic increment technique developed by Wiberg [83] and advanced by Ibrahim and Schleyer [84] to calculate what we consider resonance energy loss from these systems. These data are presented in Table 6 and have been explained in greater detail elsewhere [36]. The use of Ibrahim-Schleyer increments for the nitrogen in amides yields the data in Table 6, which would imply a total of 23.0 kcal/mol of resonance energy loss in 2-quinuclidone (3), which will lack resonance stabilization due to symmetry. As noted in our original paper [36], a more refined comparison would subtract 2.2 kcal/mol from the RE loss data in the last column of Table 6. Thus, we feel that the RE loss in 3 is 20.8 kcal/mol, that in the 3.3.1 system (11) would be 10.3 kcal/mol, and so on. Thus, the resonance energy of the normal tertiary amide (or lactam) linkage would be 20.8 kcal/mol (ca 21 kcal/mol), or the full RE loss of 3. The resonance energy of the 3.3.1 system (11) would therefore be 20.8 - 10.3 or 10.5 kcal/mol. Interestingly, the resonance energy of the 4.3.3 system (14) would be obtained by subtracting 2.2 kcal/mol, as noted above, from -3.4 kcal/mol to yield an RE loss of -5.6 kcal/mol. Thus, the resonance energy of the 4.3.3 system (14) would be 20.8 - (-5.6) or 26.4 kcal/mol, making it (thermodynamically) hyperstable via the COSNAR definition. We have made the case that the practical consequence would be to make 14 kinetically stabilized with respect to nucleophilic addition, although ring-opening reactions would still be highly exothermic, reflecting the strain in the bicyclic ring system [36].

Table 6
Calculation of the loss in resonance energy (RE Loss, kcal/mol) for bridgehead lactams using the full resonance Benson group increments for amides (including +7.6 kcal/mol for N(CO)(C)2), adding the experimental strain energies (Strain) of the bicyclic frameworks (taken here as the bicycloalkanes) to the sum of the "Schleyer" atom increments (all in au) and comparing the resulting estimate, ET (est), with ET (opt). (See Ref. 36 for further details.)

lactam           -Schleyer (au)   GI(Benson) (kcal/mol)   Strain (kcal/mol)   -ET (est) (au)   -ET (opt) (au)   RE Loss (kcal/mol)
trans-Cyclohexene Analogue
2.2.2 (3)        400.7458         -55.36                   9.7                400.8186         400.78202        23.0
trans-Cycloheptene Analogues
3.2.2 (31)       439.7726         -60.29                  13                  439.8480         439.82191        16.4
3.2.2 (30)       439.7726         -60.29                  13                  439.8480         439.82106        16.9
3.3.2 (32)       478.7994         -65.22                  17.8                478.8750         478.84770        17.1
trans-Cyclooctene Analogues
3.3.1 (11)       439.7726         -60.29                   7.8                439.8562         439.83632        12.5
3.3.2 (33)       478.7994         -65.22                  17.8                478.8750         478.86017         9.3
3.3.3 (12)       517.8262         -70.15                  26.9                517.8951         517.88266         7.8
trans-Cyclononene Analogues
4.3.3 (13)       556.8530         -75.08                  38                  556.9121         556.91145         0.4
4.3.3 (14)       556.8530         -75.08                  38                  556.9121         556.91758        -3.4
Model Compounds
1-MePyr (34)     323.8353         -50.1                   --                  323.9151         323.91275         1.5
N,N-DMA (38)     285.9478         -54.4                   --                  286.0347         286.03017         2.8
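The bookkeeping behind Table 6 and the refined resonance energies quoted in the text can be sketched in a few lines. This is only an illustration under stated assumptions: the function names are hypothetical, the conversion factor of 627.51 kcal/mol per hartree is assumed here (the source does not state which value was used), and the 2.2 kcal/mol offset and 20.8 kcal/mol full resonance energy are taken from the discussion above.

```python
HARTREE_TO_KCAL = 627.51  # assumed conversion factor, kcal/mol per hartree


def re_loss(neg_schleyer_au, gi_benson_kcal, strain_kcal, neg_e_opt_au):
    """Estimate the resonance-energy loss (kcal/mol) for one lactam.

    Energies in au are passed as tabulated in Table 6, i.e. as -E.
    The group increment and the strain energy are in kcal/mol.
    """
    # -E(est) = -E(Schleyer) - (group increment + strain), converted to au
    neg_e_est_au = neg_schleyer_au - (gi_benson_kcal + strain_kcal) / HARTREE_TO_KCAL
    # If the optimized energy lies above the estimate, resonance has been lost.
    return (neg_e_est_au - neg_e_opt_au) * HARTREE_TO_KCAL


def resonance_energy(re_loss_kcal, offset=2.2, full_re=20.8):
    """Refined resonance energy: full RE minus the offset-corrected RE loss."""
    return full_re - (re_loss_kcal - offset)


# 2-quinuclidone (3), values copied from Table 6
print(round(re_loss(400.7458, -55.36, 9.7, 400.78202), 1))
# Refined resonance energies from the tabulated RE losses of 11 and 14
print(round(resonance_energy(12.5), 1), round(resonance_energy(-3.4), 1))
```

Small differences from the tabulated RE losses (a few hundredths of a kcal/mol) are expected, since the table rounds -ET (est) to four decimals.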
We conclude this section with Figure 2, which shows the relationship between resonance energy, isodesmic equations and rotational barriers. If one uses an isodesmic reaction such as (1) to define resonance energy, then the value is ~18 kcal/mol.

$$\mathrm{RE} = \Delta H_f(\mathrm{CH_3COCH_3}) + \Delta H_f(\mathrm{N(CH_3)_3}) - \Delta H_f(\mathrm{C_2H_6}) - \Delta H_f(\mathrm{CH_3CON(CH_3)_2}) \qquad (1)$$

Virtually the same number is obtained by adding the Benson group increments [81] for N(C)3 (+24.4 kcal/mol) and CO(C)2 (-31.4 kcal/mol) and subtracting the sum of N(CO)(C)2 (+7.6 kcal/mol [36]) and CO(N)(C) (-32.8 kcal/mol), i.e. (24.4 - 31.4) - (7.6 - 32.8) = 18.2 kcal/mol. Figure 2 also shows the enthalpy of activation for planar N,N-dimethylacetamide's rotational barrier in the gas phase [85] (ca 16 kcal/mol, see discussion in reference 85). While both isodesmic equation 1 and the rotational barrier have been used casually to define resonance energy, the two definitions are quite different. Isodesmic reaction 1 compares a pyramidal nitrogen (amine) to a planar nitrogen (amide) as it "breaks" and forms two different types of C-N bonds. The rotational barrier simplifies things by maintaining the OC-N bond but compares a planar nitrogen (ground-state amide) to a pyramidal nitrogen (rotational transition state for amide). However, if we add ca 6 kcal/mol for a typical amine inversion barrier we obtain about 22 kcal/mol for the resonance energy, defined roughly as rotating a planar amide to a perpendicular amide while maintaining coplanarity at nitrogen. Obviously, reduction of overlap of the pi system to zero is also accompanied by significant lengthening of the OC-N bond and other more subtle changes. However, this 22 kcal/mol value compares well with the ca 21 kcal/mol (i.e. 20.8 kcal/mol) loss of resonance energy in 2-quinuclidone (3), defined in comparison to a hypothetical 3 maintaining its strain energy and full amide resonance.

Figure 2 indicates that if isodesmic equation 1 was employed to generate the rotational transition state for N,N-dimethylacetamide, the net enthalpy change would be roughly -3 kcal/mol. Here both amine and amide nitrogens are pyramidal and the exothermicity reflects comparison of the C-N bonds made and broken. If one wished to add the 6 kcal/mol N inversion barrier, then one could obtain ca 24 kcal/mol for the isodesmic reaction 1 in which a planar amine "reacts" with a ketone to yield the planar amide and alkane "detritus". This would roughly correspond to the COSNAR-derived resonance energy for the 4.3.3 system. The planar amide is derived from a planar amine since the 4.3.3 system forces virtual coplanarity at the bridgehead nitrogen, i.e. the amine "standard" has already "paid the price" of N inversion.

3.3. Semi-Empirical Results
Semi-empirical geometry optimizations were performed at the AM1, MNDO, and PM3 levels using all of the MM2 conformers with a Boltzmann population of > 1%. MMOK amide correction factors were included, but all of the amides were sufficiently distorted that the MMOK corrections did not apply. These calculations were done on both the BBLs and their isodesmic partners so that the COSNAR resonance energies could be computed. The calculations were carried out using the CADPAC module included with version 3.0 of the Unichem computational chemistry package running on a Cray C916 supercomputer. The structures were organized using the "series" option included in the Unichem software so that identical computations could be performed on a large number of molecules at once. All of the optimized structures represent true minima with no negative frequencies.

We had hoped that the semi-empirical calculations would maintain the conformer energies in the order predicted by MM2. Unfortunately this was not the case, although the energy differences were often smaller than those computed by MM2, especially for the smaller cage structures. In addition, the structure predicted to be the lowest-energy conformer by MM2 did not always maintain that status after semi-empirical optimization. Furthermore, the lowest-energy structures computed by the three semi-empirical models did not always come from the same MM2 starting structure. In short, there was very little coherence between the MM2 and semi-empirical results.
[Figure 2: energy-level diagram relating the reference energy (CH3COCH3 + N(CH3)3 - C2H6), the planar and rotated N,N-dimethylacetamide structures, and the hypothetical full-resonance case; the levels are labeled 6, 16, 18, 21 and 22 kcal/mol.]
Figure 2. Relationship between resonance energies and rotational barriers (ΔH‡) of tertiary amides

Table 7 lists the energies of the lowest-energy conformers as computed with each of the three semi-empirical methods, irrespective of the MM2 starting structure. The last column also displays the resonance energy as computed using the COSNAR isodesmic scheme. Clearly the semi-empirical COSNAR resonance energies are quite different from the corresponding MM2 energies. None of the semi-empirical methods predict more than ~10 kcal/mol of stabilization for any structure, and most of the resonance energies are <5 kcal/mol. Also, only the MNDO method matches the MM2 prediction that the resonance energy will be largest in the 4.3.3 (13) and 4.3.3 (14) BBLs. The AM1 and PM3 models predict the greatest stabilization in the 4.4.3 BBLs. However, we do not place much stock in the disparity since semi-empirical methods are known to give unreliable energies. Furthermore, each COSNAR resonance energy value depends on four computed structures: an error in any one of them could significantly affect the resonance energy.

Table 7
Resonance Energies from AM1, MNDO and PM3 Calculations with Energies for Individual Isodesmic Model Compounds (energies reported in kcal/mol for lowest-energy conformers only)

System        Method   Amide     Alkane    Ketone    Amine     RE
2.2.2 (3)     AM1      -31.52    -35.87    -61.54     -8.13     2.37
              MNDO     -33.61    -26.34    -53.23     -3.56    -3.16
              PM3      -43.89    -27.72    -57.77    -13.14    -0.70
3.3.2 (32)    AM1      -44.88    -41.17    -68.79    -19.99     2.73
              MNDO     -39.28    -27.04    -54.10     -7.41    -4.81
              PM3      -49.76    -28.32    -59.85    -17.47    -0.76
3.2.2 (30)    AM1      -41.70    -40.28    -66.66    -16.37     1.05
              MNDO     -39.34    -29.46    -56.47     -8.09    -4.24
              PM3      -48.78    -28.83    -59.59    -16.39    -1.63
3.2.2 (31)    AM1      -41.44    -40.28    -65.85    -16.37     0.50
              MNDO     -39.34    -29.46    -56.47     -8.09    -4.24
              PM3      -48.76    -28.83    -58.70    -16.39    -1.63
3.3.1 (11)    AM1      -41.09    -43.39    -69.16    -19.03    -1.54
              MNDO     -43.53    -33.04    -58.23    -11.84    -6.50
              PM3      -53.42    -35.00    -64.46    -21.60    -2.37
3.3.2 (33)    AM1      -48.88    -41.17    -67.62    -19.99    -2.43
              MNDO     -40.66    -27.04    -53.99     -7.41    -6.31
              PM3      -51.11    -28.32    -58.61    -17.47    -3.35
3.3.3 (12)    AM1      -48.56    -41.14    -67.90    -20.95    -0.85
              MNDO     -37.15    -21.69    -48.35     -3.99    -6.50
              PM3      -52.44    -30.25    -60.11    -19.14    -3.44
4.3.3 (13)    AM1      -51.7     -38.17    -65.38    -20.49    -4.07
              MNDO     -35.25    -15.82    -42.72     -1.79    -6.56
              PM3      -53.91    -28.37    -59.21    -18.53    -4.54
4.3.3 (14)    AM1      -55.24    -38.17    -67.69    -20.49    -5.23
              MNDO     -36.90    -15.82    -43.51     -1.79    -7.42
              PM3      -55.92    -28.37    -58.93    -18.53    -6.84
4.4.3 (16)    AM1      -56.56    -38.79    -65.84    -19.46   -10.05
              MNDO     -33.01     -1.91    -34.83      0.54    -0.64
              PM3      -57.38    -26.93    -56.50    -19.04    -8.77
4.4.3 (28)    AM1      -54.63    -38.79    -65.93    -19.46    -8.03
              MNDO     -33.37     -1.91    -37.23      0.54     1.40
              PM3      -55.07    -26.39    -58.95    -19.04    -4.01
4.4.4 (29)    AM1      -62.42    -49.77    -78.42    -29.88    -3.90
              MNDO     -28.73     -7.00    -35.62      5.87    -5.99
              PM3      -62.13    -36.51    -66.60    -27.92    -4.13

The structural data for the semi-empirical BBL lowest-energy conformers are given in Table 8. With the exception of the AM1 structures for the 4.4.3 (16) and 4.4.3 (28) BBLs, all of the optimized lactam structures are significantly more distorted than their MM2 counterparts. Particularly notable is the 3.3.2 (32) lactam, which has a twist angle of 85-90° as opposed to 44.7° in the MM2 conformer. This result implies that the added freedom of motion due to the increased bridge size does not allow the amide linkage to relax significantly from its highly strained conformation in the 2.2.2 BBL (3). If this result is accurate, it would imply that very little resonance stabilization is possible in the 3.3.2 BBL (32). Recall also that MM2 predicted nearly coplanar amide linkages in the 4.3.3 (13) and 4.3.3 (14) BBLs. The semi-empirical geometries for these compounds have linkages which are far more distorted, although the MNDO semi-empirical model predicts its smallest distortion parameters for the 4.3.3 (13) and 4.3.3 (14) BBLs. The AM1 and PM3 semi-empirical models predict that the 4.4.3 (16) and 4.4.3 (28) BBLs have the least distorted linkages. However, only AM1 predicts less total distortion than MM2. Thus, the absolute values for the resonance energy are not as large in the semi-empirical results. The only case where there is anything close to an undistorted amide linkage is the MNDO geometry for the 4.3.3 (14) BBL, which has a nearly planar geometry at nitrogen. However, the twist angle is twice as large as in the MM2 conformer.
The data in Tables 7 and 8 show that, for the most part, the semi-empirical results do not fit our predictions nearly as well as the MM2 results, with two exceptions. The N-CO bond lengths do shorten with decreasing distortion in the AM1 calculations, just as predicted in both the Pauling and Wiberg group models of the amide. However, all of the N-CO bond lengths are too long, even for the relatively undistorted BBLs. Furthermore, since the shortening of the N-CO bond is certainly tied to specific electronic effects, it is not at all surprising that semi-empirical methods, which work with the wavefunctions of the valence electrons, work better than MM2 methods, which ignore wavefunctions entirely. The semi-empirical results also predict smaller pyramidalization angles for the CO carbons: all of them are within the experimentally observed limit of ±5.0°. Overall, the absolute values from the semi-empirical results were quite disappointing. However, Table 9 lists the structural data and resonance energies by method, and it is quite clear that the data from all three methods follow the general trend of increasing resonance energy with decreasing distortion. Thus, even though the geometries looked too distorted and the resonance energies seemed too low, the relative behaviors of all quantities were as expected.
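The RE column of Table 7 can be recovered from the four heats of formation in each row. The sketch below assumes the sign convention that reproduces the tabulated numbers (amide plus alkane minus ketone minus amine, so that a negative value means stabilization); that convention is inferred from the table rather than stated in the text, and the function name `cosnar_re` is hypothetical.

```python
def cosnar_re(amide, alkane, ketone, amine):
    """COSNAR-style isodesmic resonance energy in kcal/mol.

    Computed as (amide + alkane) - (ketone + amine) from heats of formation;
    this sign convention is an assumption inferred from the Table 7 entries.
    """
    return (amide + alkane) - (ketone + amine)


# Rows copied from Table 7: (amide, alkane, ketone, amine)
rows = {
    "2.2.2 (3), MNDO": (-33.61, -26.34, -53.23, -3.56),
    "2.2.2 (3), PM3": (-43.89, -27.72, -57.77, -13.14),
    "3.3.2 (32), AM1": (-44.88, -41.17, -68.79, -19.99),
}

for label, heats in rows.items():
    print(f"{label}: RE = {cosnar_re(*heats):.2f} kcal/mol")
```

Because each value depends on four computed heats of formation, an error in any one of them propagates directly into the resonance energy, which is the caveat raised in the text.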
Table 8
Semi-Empirical Structural Data for BBL Lowest Energy Conformers (bond lengths are in Å and distortion parameters are in °)

System        Method   rCN     rCO     ΣCNC    χN     χCO    τ      Total
2.2.2 (3)     AM1      1.469   1.230   323.1   63.3   0.3    90.0   153.3
              MNDO     1.475   1.216   328.6   59.0   0.8    88.8   148.6
              PM3      1.494   1.207   326.1   60.9   0.7    89.1   150.7
3.3.2 (32)    AM1      1.442   1.236   344.0   43.5   1.1    89.1   133.7
              MNDO     1.452   1.226   351.9   31.3   2.1    84.3   117.7
              PM3      1.477   1.211   341.3   46.6   0.4    86.8   133.6
3.2.2 (30)    AM1      1.449   1.235   335.8   59.1   1.0    63.7   123.8
              MNDO     1.461   1.219   341.4   45.9   0.3    70.9   117.1
              PM3      1.479   1.211   336.8   51.5   0.3    65.5   117.3
3.2.2 (31)    AM1      1.445   1.236   331.4   55.4   0.4    56.0   111.4
              MNDO     1.460   1.221   341.4   47.9   1.4    71.2   120.4
              PM3      1.476   1.212   335.5   54.5   0.1    56.9   111.5
3.3.1 (11)    AM1      1.437   1.238   333.8   56.2   0.7    34.2    91.1
              MNDO     1.449   1.223   340.7   48.8   2.5    43.9    95.2
              PM3      1.465   1.215   334.6   55.4   1.3    36.2    92.9
3.3.2 (33)    AM1      1.419   1.243   344.9   43.7   0.5    27.4    71.6
              MNDO     1.444   1.224   351.6   32.8   0.7    51.9    85.4
              PM3      1.463   1.212   344.6   43.4   1.1    46.4    90.9
3.3.3 (12)    AM1      1.421   1.241   350.1   35.3   2.2    47.8    85.3
              MNDO     1.438   1.223   358.4   14.7   0.6    61.2    76.5
              PM3      1.465   1.215   348.2   38.2   0.1    55.8    94.1
4.3.3 (13)    AM1      1.396   1.248   357.3   19.0   0.7    21.7    41.4
              MNDO     1.438   1.224   358.2   15.3   2.8    65.2    83.3
              PM3      1.440   1.221   353.3   29.7   3.1    27.1    59.9
4.3.3 (14)    AM1      1.401   1.244   356.9   24.1   0.8    26.9    51.8
              MNDO     1.428   1.227   360.0    1.9   1.1    42.6    45.6
              PM3      1.444   1.220   353.3   29.2   1.4    33.8    64.4
4.4.3 (16)    AM1      1.388   1.251   359.9    4.6   2.0     8.2    14.8
              MNDO     1.428   1.229   357.3   19.2   1.6    34.8    55.6
              PM3      1.426   1.226   358.4   14.9   3.3    12.6    30.8
4.4.3 (28)    AM1      1.391   1.249   359.6    7.4   0.5    22.0    29.9
              MNDO     1.429   1.228   357.6   17.9   1.7    40.2    59.8
              PM3      1.434   1.222   356.2   22.7   4.0    26.1    52.8
4.4.4 (29)    AM1      1.404   1.246   359.3    5.1   0.3    33.4    39.2
              MNDO     1.442   1.226   357.8   17.6   2.3    50.6    70.9
              PM3      1.455   1.216   355.5   24.3   3.6    45.8    73.7
Table 9
Semi-Empirical Data Organized by Method (energies in kcal/mol, bond lengths are in Å, distortion parameters in °)

System        rCN     rCO     ΣCNC    χN     χCO    τ      total    RE
MNDO
2.2.2 (3)     1.475   1.216   328.6   59.0   0.8    88.8   148.6    -3.16
3.3.2 (32)    1.452   1.226   351.9   31.3   2.1    84.3   117.7    -4.81
3.2.2 (30)    1.461   1.219   341.4   45.9   0.3    70.9   117.1    -4.24
3.2.2 (31)    1.460   1.221   341.4   47.9   1.4    71.2   120.4    -4.75
3.3.1 (11)    1.449   1.223   340.7   48.8   2.5    43.9    95.2    -6.50
3.3.2 (33)    1.444   1.224   351.6   32.8   0.7    51.9    85.4    -6.31
3.3.3 (12)    1.438   1.223   358.4   14.7   0.6    61.2    76.5    -6.50
4.3.3 (13)    1.438   1.224   358.2   15.3   2.8    65.2    83.3    -6.56
4.3.3 (14)    1.428   1.227   360.0    1.9   1.1    42.6    45.6    -7.42
4.4.3 (16)    1.428   1.229   357.3   19.2   1.6    34.8    55.6     0.64
4.4.3 (28)    1.429   1.228   357.6   17.9   1.7    40.2    59.8     1.40
4.4.4 (29)    1.442   1.226   357.8   17.6   2.3    50.6    70.9    -5.99
AM1
2.2.2 (3)     1.469   1.230   323.1   63.3   0.3    90.0   153.3     2.36
3.3.2 (32)    1.442   1.236   344.0   43.5   1.1    89.1   133.7     2.73
3.2.2 (30)    1.449   1.235   335.8   59.1   1.0    63.7   123.8     1.05
3.2.2 (31)    1.445   1.236   331.4   55.4   0.4    56.0   111.4     0.50
3.3.1 (11)    1.437   1.238   333.8   56.2   0.7    34.2    91.1    -1.54
3.3.2 (33)    1.419   1.243   344.9   43.7   0.5    27.4    71.6    -2.43
3.3.3 (12)    1.421   1.241   350.1   35.3   2.2    47.8    85.3    -0.85
4.3.3 (13)    1.396   1.248   357.3   19.0   0.7    21.7    41.4    -4.07
4.3.3 (14)    1.401   1.244   356.9   24.1   0.8    26.9    51.8    -5.23
4.4.3 (16)    1.388   1.251   359.9    4.6   2.0     8.2    14.8   -10.05
4.4.3 (28)    1.391   1.249   359.6    7.4   0.5    22.0    29.9    -8.03
4.4.4 (29)    1.404   1.246   359.3    5.1   0.3    33.4    39.2    -3.90
PM3
2.2.2 (3)     1.494   1.207   326.1   60.9   0.7    89.1   150.7    -0.70
3.3.2 (32)    1.477   1.211   341.3   46.6   0.4    86.8   133.6    -0.76
3.2.2 (30)    1.479   1.211   336.8   51.5   0.3    65.5   117.3    -1.63
3.2.2 (31)    1.476   1.212   335.5   54.5   0.1    56.9   111.5    -2.34
3.3.1 (11)    1.465   1.215   334.6   55.4   1.3    36.2    92.9    -2.37
3.3.2 (33)    1.463   1.212   344.6   43.4   1.1    46.4    90.9    -3.35
3.3.3 (12)    1.465   1.215   348.2   38.2   0.1    55.8    94.1    -3.43
4.3.3 (13)    1.440   1.221   353.3   29.7   3.1    27.1    59.9    -4.54
4.3.3 (14)    1.444   1.220   353.3   29.2   1.4    33.8    64.4    -6.84
4.4.3 (16)    1.426   1.226   358.4   14.9   3.3    12.6    30.8    -8.77
4.4.3 (28)    1.434   1.222   356.2   22.7   4.0    26.1    52.8    -4.01
4.4.4 (29)    1.455   1.216   355.5   24.3   3.6    45.8    73.7    -4.13
4. SUMMARY

In brief, ab initio molecular orbital calculations using the 6-31G* basis set appear to predict good molecular geometries and reasonable experimental resonance energies. The "pure" resonance energy of a tertiary amide or lactam, as predicted by a) the loss of resonance in 2-quinuclidone (3) or b) the rotational barrier of an amide to yield a hypothetical orthogonal transition state maintaining planarity at nitrogen, is 21-22 kcal/mol. Molecular mechanics (MM2 in the Spartan 3.1 Program) provides reasonable overall conformational structures and surprisingly decent strain energies. However, the N-CO bond length is, not surprisingly, insensitive to distortion. The semi-empirical techniques do a surprisingly poor job of calculating distortion energies in this series.

5. ACKNOWLEDGMENTS

A. Greenberg acknowledges numerous stimulating discussions with Dr. Joel F. Liebman. He also gratefully acknowledges the seminal work of Dr. Linus Pauling on the structure of the amide linkage as well as Dr. Pauling's influence on him and on generations of chemists. David T. Moore gratefully acknowledges the Microelectronics Center of North Carolina (MCNC) for funding his stay at Cray Research, Inc., and thanks Dr. Susan Gustafson and Dr. John Carpenter, Cray Research, Inc., for helpful discussions.
REFERENCES

1. Pauling, L., The Nature of the Chemical Bond, 3rd Ed., Cornell Univ. Press, Ithaca, 1960, pp 281-282.
2. Greenberg, A., in Structure and Reactivity, Vol. 7, Molecular Structure and Energetics, Liebman, J.F.; Greenberg, A. (Eds), VCH Pub., New York, 1988, pp 139-178.
3. Manhas, M.S.; Bose, A.K., Beta-Lactams: Natural and Synthetic, Part 1, Wiley-Interscience, New York, 1971.
4. Sheehan, J.C., Angew. Chem. Int. Ed. Engl. 1968
5. Lukes, R., Collect. Czech. Chem. Commun. 1938, 10, 148.
6. Yakhontov, L.N.; Rubtsov, M.V., J. Gen. Chem. USSR (Eng. Trans.) 1957, 27, 83.
7. Hall, H.K., Jr.; El-Shekeil, A., Chem. Rev. 1983, 83, 549.
8. Muhlebach, A.; Lorenzi, G.P.; Gramlich, V., Helv. Chim. Acta 1986, 69, 395.
9. Winkler, F.K.; Dunitz, J.D., J. Mol. Biol. 1971, 59, 169.
10. White, D.N.J.; Guy, M.H.P., J. Chem. Soc. Perkin Trans. 2 1975, 43.
11. Blaha, K.; Malon, P., Acta Universitatis Palackianae Olomucensis Facultatis Medicae, Int. Org. Chem. Biochem., Czech. Acad. Sci., 1980, 93, 81.
12. Greenberg, A.; Liebman, J.F., Strained Organic Molecules, Academic Press, New York, 1978.
13. Lease, T.G.; Shea, K.J., in Advances in Theoretically Interesting Molecules, Vol. 2, Thummel, R.P. (ed), JAI Press, Greenwich, CT, 1992, pp 79-112.
14. Wiberg, K.B.; Laidig, K.E., J. Amer. Chem. Soc. 1987, 109, 5935.
15a. Breneman, C.M.; Wiberg, K.B., J. Comput. Chem. 1990, 11, 361.
15b. Wiberg, K.B.; Breneman, C.M., J. Amer. Chem. Soc. 1992, 114, 831.
16. Wiberg, K.B.; Glaser, R., J. Amer. Chem. Soc. 1992, 114, 841.
17. Wiberg, K.B.; Rablen, P.R., J. Amer. Chem. Soc. 1995, 117, 2201.
18. Woodward, R.B.; Neuberger, A.; Trenner, N.R., in The Chemistry of Penicillin, Clarke, H.T.; Johnson, J.R.; Robinson, R. (eds), Princeton Univ. Press, Princeton, 1949, pp 415-439.
19. Sweet, R.M.; Dahl, L.F., J. Amer. Chem. Soc. 1970, 92, 5489.
20. Ramachandran, G.N., Biopolymers 1968, 6, 1494.
21. Walsh, C.T., Enzymes, McGraw-Hill, New York, 1968.
22. Mock, W.L., Bioorg. Chem. 1976, 5, 403.
23. Harrison, R.K.; Stein, R.L., Biochemistry 1990, 29, 1684.
24. Stein, R.L., Adv. Protein Chem. 1993, 44, 1-24.
25. Liu, J.; Albers, M.W.; Chen, C.M.; Schreiber, S.L.; Walsh, C.T., Proc. Nat. Acad. Sci. (USA) 1990, 87, 2304.
26. Schreiber, S.L., Chem. Eng. News 1992, October 26, pp 22-32.
27. Schmid, F.X.; Mayr, L.R.; Mucke, M.; Schonbrunner, E.R., Adv. Protein Chem. 1993, 44, 25-66.
28. Somayaji, V.; Skorey, K.I.; Brown, R.S.; Ball, R.G., J. Org. Chem. 1986, 51, 4866.
29. Pracejus, H., Chem. Ber. 1959, 92, 988.
30. Pracejus, H., Chem. Ber. 1965, 98, 2897.
31. Pracejus, H.; Kehlen, M.; Kehlen, H.; Matschiner, H., Tetrahedron 1965, 21, 2257.
32. Levkoeva, E.I.; Nikitskaya, E.S.; Yakhontov, L.N., Khim. Geterot. Soed. 1971, (3), 378.
33. Levkoeva, E.I.; Nikitskaya, E.S.; Yakhontov, L.N., Dokl. Akad. Nauk SSSR 1970, 192, 342.
34. Somayaji, V.; Brown, R.S., J. Amer. Chem. Soc. 1987, 109, 4620.
35. Greenberg, A.; Venanzi, C.A., J. Amer. Chem. Soc. 1993, 115, 6951.
36. Greenberg, A.; Moore, D.T.; DuBois, T.D., J. Amer. Chem. Soc. 1996, 118, 8658.
37. Werstiuk, N.H.; Brown, R.S.; Wang, Q.P., Can. J. Chem. 1996, 74, 524.
38. Dunitz, J.D.; Winkler, F.K., Acta Crystallogr. 1975, B31, 251.
39. Bennet, A.J.; Wang, Q.P.; Slebocka-Tilk, H.; Somayaji, V.; Brown, R.S., J. Amer. Chem. Soc. 1990, 112, 6383.
40. Lease, T.G.; Shea, K.J., J. Amer. Chem. Soc. 1993, 115, 2248.
41. Yamada, S., Angew. Chem., Int. Ed. Engl. 1993, 32, 1083.
42. Gastone, G.; Bertolasi, V.; Bellucci, F.; Ferretti, V., J. Amer. Chem. Soc. 1986, 108, 2420.
43. Cieplak, A.S., Struct. Chem. 1994, 5, 85.
44. Wang, A.H.J.; Missavage, R.J.; Byrn, S.R.; Paul, I.C., J. Amer. Chem. Soc. 1972, 94, 7100.
45. Coll, J.C.; Crist, D.R.; Barrio, M.G.; Leonard, N.J., J. Amer. Chem. Soc. 1972, 94, 7092.
46. Maier, W.F.; Schleyer, P.v.R., J. Amer. Chem. Soc. 1981, 103, 1891.
47. McEwen, A.B.; Schleyer, P.v.R., J. Amer. Chem. Soc. 1986, 108, 3951.
48. Buchanan, G.L.; Kitson, D.H.; Mallinson, P.R.; Sim, G.A.; White, D.N.J.; Cox, P.J., J. Chem. Soc. Perkin Trans. 2 1983, 1709.
49. Wang, Q.P.; Bennet, A.J.; Brown, R.S.; Santarsiero, B.D., J. Amer. Chem. Soc. 1991, 113, 5757.
50. McCabe, P.H.; Milne, N.J.; Sim, G.A., J. Chem. Soc., Perkin Trans. II, 1989, 1459.
51. Wang, A.H.J.; Paul, I.C.; Talaty, E.R.; Dupuy, A.E., Jr., J. Chem. Soc. Chem. Commun. 1972, 43.
52. Rekharsky, M.V.; Nemykina, E.V.; Erokhin, A.S., Int. J. Biochem. 1992, 24, 861.
53. Rawitscher, M.; Wadso, I.; Sturtevant, J.M., J. Amer. Chem. Soc. 1961, 83, 3180.
54. Wiberg, K.B., in Molecular Structure and Energetics, Vol. 2, Liebman, J.F.; Greenberg, A. (eds), VCH Pub., New York, 1987, pp. 151-171.
55. Wiberg, K.B.; Crocker, L.S.; Morgan, K.M., J. Amer. Chem. Soc. 1991, 113, 3447.
56. Wadso, I., Acta Chem. Scand. 1962, 16, 471, 479.
57. Wadso, I., Acta Chem. Scand. 1965, 19, 1079.
58. Bennet, A.J.; Somayaji, V.; Brown, R.S.; Santarsiero, B.D., J. Amer. Chem. Soc. 1991, 113, 7563. (See also a paper concerning 17O NMR of lactams: Boykin, D.W.; Sullins, .W.; Pourahmady, N.; Eisenbraun, E.J., Heterocycles 1989, 29, 307.)
59. Laidig, K.E.; Bader, R.F.W., J. Amer. Chem. Soc. 1991, 113, 6312.
60. Perrin, C.L., J. Amer. Chem. Soc. 1991, 113, 2865.
61a. Greenberg, A.; Thomas, T.D.; Bevilacqua, C.R.; Coville, M.; Ji, D.; Tsai, J.C.; Wu, G., J. Org. Chem. 1992, 57, 7093.
61b. Greenberg, A.; Moore, D.T., J. Molec. Struct., in press.
62. Tuchiya, S.; Seno, M., J. Org. Chem. 1979, 44, 2850.
63. Tuchiya, S.; Seno, M.; Lwowski, W., J. Chem. Soc. Chem. Commun. 1982, 875.
64. Treschanke, L.; Rademacher, P., J. Mol. Struct. (THEOCHEM) 1985, 122, 35.
65. Treschanke, L.; Rademacher, P., J. Mol. Struct. (THEOCHEM) 1985, 122, 47.
66. Greenberg, A.; Wu, G.; Tsai, J.C.; Chiu, Y.Y., Struct. Chem. 1993, 4, 127.
67. Boyd, D.B.; Ott, J.L., J. Antibiot. 1986, 39, 281.
68. Dinur, U.; Hagler, A.T., J. Comput. Chem. 1995, 16, 154. (See also a paper on beta-lactams: Fernandez, B.; Rios, M.A., J. Comput. Chem. 1994, 15, 455.)
69. Boggs, J.E.; Niu, Z., J. Comput. Chem. 1985, 6, 46.
70. Greenberg, A.; Chiu, Y.Y.; Johnson, J.L.; Liebman, J.F., Struct. Chem. 1991, 2, 117.
71. Greenberg, A.; Hsing, H.J.; Liebman, J.F., J. Mol. Struct. (THEOCHEM), in press.
72. Greene, F.D.; Stowell, J.C.; Bergmark, W.R., J. Org. Chem. 1969, 34, 2254.
73. Norskov-Lauritsen, L.; Burgi, H.B.; Hofmann, P.; Schmidt, H.R., Helv. Chim. Acta 1985, 68, 76.
74. Blackburn, G.M.; Skaife, C.J.; Kay, I.T., J. Chem. Res., Miniprint 1980, 3650.
75. Somayaji, V.; Brown, R.S., J. Org. Chem. 1986, 51, 2676.
76. Hall, H.K., Jr.; Shaw, R.G., Jr.; Deutschmann, A., J. Org. Chem. 1980, 45, 3723.
77. Steliou, K.; Poupart, M.A., J. Amer. Chem. Soc. 1983, 105, 7130.
78. Denzer, M.; Ott, H., J. Org. Chem. 1969, 34, 183.
79. Buchanan, G.L., J. Chem. Soc. Perkin Trans. 1 1984, 2669.
80a. SPARTAN version 4.0, Wavefunction, Inc., 18401 Von Karman Ave., #370, Irvine, CA 92715 U.S.A., c1994 Wavefunction, Inc.
80b. SPARTAN version 3.1, Wavefunction, Inc., 18401 Von Karman Ave., #370, Irvine, CA 92715 U.S.A.
81. Benson, S.W., Thermochemical Kinetics, 2nd Ed., J. Wiley & Sons, 1976.
82. Gaussian 90, M.J. Frisch, M. Head-Gordon, G.W. Trucks, J.B. Foresman, H.B. Schlegel, K. Raghavachari, M.A. Robb, J.S. Binkley, C. Gonzalez, D.J. Defrees, D.J. Fox, R.A. Whiteside, R. Seeger, C.F. Melius, J. Baker, R.L. Martin, L.R. Kahn, J.J.P. Stewart, S. Topiol, and J.A. Pople, Gaussian, Inc., Pittsburgh PA, 1990.
83. Wiberg, K.B., J. Comput. Chem. 1984, 5, 197.
84. Ibrahim, M.R.; Schleyer, P.v.R., J. Comput. Chem. 1985, 6, 157.
85. Wiberg, K.B.; Rablen, P.R.; Rush, D.J.; Keith, T.A., J. Amer. Chem. Soc. 1995, 117, 4261.
Z.B. Maksić and W.J. Orville-Thomas (Editors)
Pauling's Legacy: Modern Modelling of the Chemical Bond
Theoretical and Computational Chemistry, Vol. 6
© 1999 Elsevier Science B.V. All rights reserved.

Some Chemical and Structural Factors Related to the Metastabilities of Energetic Compounds

Peter Politzer and Jane S. Murray
University of New Orleans, Department of Chemistry, New Orleans, Louisiana 70148

1. INTRODUCTION
A continuing challenge in the design of new energetic materials (e.g. explosives and propellants) is to reconcile the necessary metastability with the desired insensitivity. Simply put, they should be sufficiently unstable to be capable of suddenly releasing a great deal of energy, but stable enough that this does not happen before it is wanted! Since these two objectives are intrinsically somewhat contradictory, the realistic goal is an optimum compromise, which maximizes energetic performance while minimizing sensitivity to unintended stimuli.

Some of the data relevant to the first issue can now be predicted with generally satisfactory accuracy; this includes density [1-3], heat of formation [4-6], detonation pressure and velocity [7,8] and specific impulse [9]. These properties permit meaningful assessments of the potential level of energetic performance. The prediction of sensitivity, however, continues to be an area of considerable activity. A key point, with regard to both issues, is the decomposition process of the compound: What are its energetics and how readily does it occur?

Our emphasis in this chapter shall be upon factors that influence the ease with which decomposition can be initiated by unwanted external stimuli, i.e. sensitivity. These stimuli may be of various types, including impact, shock, friction, heat and electrostatic charge [8]. Relative vulnerabilities to these different effects need not be the same; for example, the onset temperatures for the thermal decomposition of TNT (1) and HMX (2) are quite similar, but the latter is much more likely to undergo detonation upon impact [8]. It has been shown, however, that there is a general correlation between impact and shock sensitivities [10], which are the ones upon which we will focus.

Impact and shock sensitivities depend upon a variety of factors: chemical, structural and physical. We shall limit our discussion to the effects of molecular structure (i.e. chemical composition and molecular geometry). Thus we will not directly address, for example, the mechanisms of "hot spot" formation in the crystal [11-13], or the role of particle size [10,12,14].
[Structures 1 (TNT) and 2 (HMX)]
Impact sensitivity is taken to be inversely proportional to the height (h50) from which a given weight falling upon the compound has a 50% probability of producing an explosion. Shock sensitivity is directly proportional to the maximum gap width through which a standard shock wave has a 50% probability of causing an explosion. It is important to recognize that both types of results are very dependent upon the specific physical conditions of the processes, and reproducibility can be a problem [8,12,15,16].

2. IMPACT/SHOCK SENSITIVITY AND MOLECULAR STRUCTURE: SOME BACKGROUND
2.1. Structure-Sensitivity Relationships

The effects of molecular structure upon impact/shock sensitivity have been analyzed and reviewed in some detail on a number of occasions [8,10,12,17]. Over the years, there have been frequent efforts to relate the experimental impact or shock sensitivities of groups of compounds of a given type (e.g. trinitroaromatics) to some molecular quantity or quantities. Molecular stoichiometry has proven to be remarkably effective in this context [10,15,16,18-20], most notably Kamlet's oxidant balance formula [15,16,18], which is essentially a measure of the oxygen that is present relative to what is needed to convert all hydrogens to H2O and carbons to CO. In general, the larger the oxidant balance, the greater the impact sensitivity.

In his extensive and very significant analyses of the impact sensitivities of energetic compounds, Kamlet emphasized the roles of C-NO2 and N-NO2 bonds as "trigger linkages", the cleavage of which is often a key step in decomposition processes [15,18]. Considerable experimental evidence has accumulated in support of this view [21-35]. Kamlet also argued, with regard to C-NO2 compounds, that rotation around the C-NO2 bond has a desensitizing effect since it reduces the amount of externally-provided energy that can go into a C-NO2 stretching vibrational mode and promote bond-breaking [18]. Thus any steric hindrance of rotation can be expected to increase sensitivity. In this context, it is relevant to note recent studies indicating that nitro groups have an enhanced capacity for localizing transferred vibrational energy [36,37].

Reflecting this emphasis upon the C-NO2 and N-NO2 bonds, several studies have sought to correlate impact and shock sensitivities with properties of these bonds, particularly various measures of their stabilities. These have included the lengths of the bonds [38,39], the electrostatic potentials at their midpoints [38,40,41], and their polarities in excited states of the molecules [42]. Taking a different approach, Kohno et al. have argued that the impact sensitivities of nitramines can be related to the differences between the N-NO2 distances in the gas phase and in the crystal [43,44].
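As a hedged illustration of the stoichiometric approach described above, the following minimal sketch evaluates an oxidant balance from a molecular formula. It assumes the form in which Kamlet's formula is commonly quoted, OB100 = 100(2nO - nH - 2nC - 2nCOO)/(molecular weight), which is not written out in this chapter; the function name, the rounded atomic masses and the choice of example molecules are illustrative assumptions.

```python
# Rounded standard atomic masses in g/mol (assumed values for illustration).
ATOMIC_MASS = {"C": 12.011, "H": 1.008, "N": 14.007, "O": 15.999}


def oxidant_balance_100(n_c, n_h, n_n, n_o, n_coo=0):
    """Oxidant balance per 100 g, in the commonly quoted Kamlet form.

    Compares the oxygen present with what is needed to convert all
    hydrogens to H2O and all carbons to CO. n_coo is the number of
    carboxyl groups (an assumed correction term; zero for the examples).
    """
    mol_wt = (n_c * ATOMIC_MASS["C"] + n_h * ATOMIC_MASS["H"]
              + n_n * ATOMIC_MASS["N"] + n_o * ATOMIC_MASS["O"])
    return 100.0 * (2 * n_o - n_h - 2 * n_c - 2 * n_coo) / mol_wt


# Molecular formulas given as (C, H, N, O) atom counts.
ob_tnt = oxidant_balance_100(7, 5, 3, 6)   # TNT, C7H5N3O6
ob_tnb = oxidant_balance_100(6, 3, 3, 6)   # 1,3,5-trinitrobenzene, C6H3N3O6
ob_hmx = oxidant_balance_100(4, 8, 8, 8)   # HMX, C4H8N8O8

for name, ob in [("TNT", ob_tnt), ("TNB", ob_tnb), ("HMX", ob_hmx)]:
    print(f"{name}: OB100 = {ob:.2f}")
```

Under these assumptions TNT has the most negative oxidant balance of the three and HMX the largest, which is in the same direction as the statement in Section 1 that HMX is much more likely than TNT to detonate upon impact. Correlations of this kind are normally applied within a single class of compounds, so the cross-class comparison is only indicative.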
2.2. Some Specific Decomposition Pathways

The various structure-sensitivity relationships mentioned above typically treat different classes of compounds separately. Nitramines, for example, are not expected to fit on the same correlations as nitroaromatics or nitroheterocycles. Even within a given class, however, a variety of decomposition mechanisms may be operative. Indeed, Kamlet and Adolph found it necessary to establish two oxidant balance correlations for nitroaromatics [16], one being only for molecules having a CH-containing substituent alpha to a nitro group, 3:
[Structures 3 and 4]
Evidently the α-H is likely to be involved in the early stages of the decomposition process [16], perhaps moving to an adjoining nitro group to form a nitronic acid tautomer, 4 [35,45,46]. These are known to be reactive and unstable [47]. The transfer or loss of a proton to yield a nitronic acid or a nitronate (aci) anion has also been invoked as the initial step in the decompositions of other energetic molecules, e.g. picric acid [35,48] and amine-sensitized nitromethane [49-51].

A particularly interesting compound is TATB, 5, which shows a remarkable lack of sensitivity [10,16]. The decomposition of TATB proceeds, in its early stages, through furazan and furoxan intermediates [52,53], e.g. 6, the formation of which is believed not to involve any significant net release of energy [10,54,55]. This presumably means that the progression to detonation is less rapid than if these early steps were exothermic. This has been proposed as an explanation of the insensitivity of TATB [10,55]. As a contrast, TNT (1), which is much more sensitive than TATB, decomposes through the intermediate 7 (among others) [16,56,57], which is formed exothermically [54].
[Structures 5 (TATB), 6 and 7]
A structural feature that is frequently associated with instability is the presence of several linked nitrogens [58]. Depending upon the molecular environment, this can provide a relatively facile decomposition pathway through the loss of N2. Storm et al. used this reasoning to interpret the observed sensitivities of some picryl triazoles [59], and we have shown computationally that the high sensitivity of the triazole 8 can be explained in an analogous manner [60]. On the other hand, some derivatives of the tetraazapentalenes 9 and 10 have shown a surprising degree of stability [58,61,62]; we have speculated that this is related to the relatively positive character of the two triply-coordinated nitrogens [63].
[Structures 8, 9 and 10]
It should be apparent from even this very brief discussion that impact and shock sensitivities can depend upon a variety of chemical and structural factors. Any generalizations should be made cautiously, and are likely to be subject to qualifications and limitations.
3. RELATIONSHIPS BETWEEN IMPACT SENSITIVITIES AND MOLECULAR SURFACE ELECTROSTATIC POTENTIALS

3.1. Analysis and Characterization of Surface Potentials
We have shown in earlier work that it is possible to quantitatively relate a variety of liquid, solid and solution phase properties to the electrostatic potential patterns on the surfaces of the individual molecules [64-66]. Among these properties are pKa, boiling points and critical constants, enthalpies of fusion, vaporization and sublimation, solubilities, partition coefficients, diffusion constants and viscosities. For these purposes, we take the molecular surface to be the 0.001 au contour of the molecular electronic density $\rho(\mathbf{r})$, following the suggestion of Bader et al. [67].

The electrostatic potential that is produced at any point r in the space around a molecule by its nuclei and electrons is given by eq. (1):

$$V(\mathbf{r}) = \sum_A \frac{Z_A}{|\mathbf{R}_A - \mathbf{r}|} - \int \frac{\rho(\mathbf{r}')\,d\mathbf{r}'}{|\mathbf{r}' - \mathbf{r}|} \qquad (1)$$
$Z_A$ is the charge on nucleus A, located at $\mathbf{R}_A$. The sign and magnitude of V(r) are the net result of the positive and negative contributions of the nuclei and electrons, respectively, at the point r. For V(r) computed on the molecular surface, we determine the local maxima and minima (i.e. the most positive and negative values, $V_{S,max}$ and $V_{S,min}$), the average deviation $\Pi$ and the total variance $\sigma_{tot}^2$. The latter two are defined by eqs. (2) and (3):

$$\Pi = \frac{1}{n}\sum_{i=1}^{n}\left|V(\mathbf{r}_i) - \bar{V}_S\right| \qquad (2)$$

$$\sigma_{tot}^2 = \sigma_+^2 + \sigma_-^2 = \frac{1}{m}\sum_{i=1}^{m}\left[V^+(\mathbf{r}_i) - \bar{V}_S^+\right]^2 + \frac{1}{n}\sum_{j=1}^{n}\left[V^-(\mathbf{r}_j) - \bar{V}_S^-\right]^2 \qquad (3)$$

$\bar{V}_S$ is the average potential over the entire surface: $\bar{V}_S = \frac{1}{n}\sum_{i=1}^{n} V(\mathbf{r}_i)$. $V^+(\mathbf{r}_i)$ and $V^-(\mathbf{r}_j)$ are the positive and negative values of V(r) on the surface, and $\bar{V}_S^+$ and $\bar{V}_S^-$ are their averages: $\bar{V}_S^+ = \frac{1}{m}\sum_{i=1}^{m} V^+(\mathbf{r}_i)$ and $\bar{V}_S^- = \frac{1}{n}\sum_{j=1}^{n} V^-(\mathbf{r}_j)$.
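The surface quantities defined in eqs. (2) and (3) are simple statistics over the sampled surface potentials. A minimal sketch follows, assuming that the values V(r_i) are already available as a flat list in kcal/mole; the function name, the dictionary keys and the sample values are hypothetical and are not taken from any calculation in this chapter.

```python
import numpy as np


def surface_statistics(v_surface):
    """Return the average deviation and variances of eqs. (2) and (3).

    v_surface: potentials V(r_i) sampled on the molecular surface.
    Points with exactly zero potential enter the overall average but
    neither the positive nor the negative subset (an assumption of
    this sketch).
    """
    v = np.asarray(v_surface, dtype=float)
    v_bar = v.mean()                      # average over the whole surface
    pi = np.abs(v - v_bar).mean()         # average deviation, eq. (2)

    pos, neg = v[v > 0], v[v < 0]
    v_bar_pos = pos.mean() if pos.size else 0.0
    v_bar_neg = neg.mean() if neg.size else 0.0
    var_pos = ((pos - v_bar_pos) ** 2).mean() if pos.size else 0.0
    var_neg = ((neg - v_bar_neg) ** 2).mean() if neg.size else 0.0

    return {
        "pi": float(pi),
        "v_bar_pos": float(v_bar_pos),
        "v_bar_neg": float(v_bar_neg),
        "var_pos": float(var_pos),
        "var_neg": float(var_neg),
        "var_tot": float(var_pos + var_neg),   # total variance, eq. (3)
    }


# Hypothetical surface values, for illustration only.
stats = surface_statistics([10.0, 20.0, 30.0, -5.0, -15.0])
print(stats)
```

Because the variances are built from squared deviations, a few strongly positive or negative surface points dominate them, whereas the average deviation weights all points linearly.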
We view $\Pi$ as an indicator of the internal charge separation, or local polarity, that is present even in molecules with zero dipole moments. It has been shown to correlate with dielectric constants [68] and with an experimentally-based measure of polarity [65]. The total variance, $\sigma_{tot}^2$, which is the sum of the positive and negative variances, reflects the spread, or range of values, of the surface electrostatic potential. It is particularly sensitive to the positive and negative extremes, because of the terms being squared. We have found $\sigma_{tot}^2$ to be effective as a measure of a molecule's tendency for noncovalent interactions; for example, it enters into our expressions for boiling point [69], enthalpy of vaporization [64], solubilities [70,71], etc.

3.2. Unsaturated C-Nitro Derivatives: Nitroaromatics and Nitroheterocycles

The electrostatic potential of a ground state atom is positive everywhere [72]; the nuclear term in eq. (1) dominates over that of the dispersed electrons. When atoms combine to form a molecule, some negative region or regions normally develop; these are often due to lone pairs on the more electronegative atoms, e.g. nitrogen, oxygen, the halogens, etc., but they may also reflect other factors, such as unsaturated or strained C-C bonds [73,74].

The introduction of the strongly electron-withdrawing nitro group into an unsaturated molecule generally has the effect of eliminating the regions of negative electrostatic potential due to the π electrons. For example, whereas benzene and aniline have extensive negative potentials above the ring, nitrobenzene is positive everywhere except near the oxygens [75]. Analogous statements apply to the regions above the triple bonds of acetylene and nitroacetylene [76].

Two classes of unsaturated compounds that are of interest in the present context, as energetic materials, are trinitroaromatics and nitroheterocycles, e.g. imidazoles and triazoles. The electrostatic potential on the molecular surface of 1,3,5-trinitrobenzene, 11, which is the parent molecule for the trinitroaromatics that we shall consider, has local maxima above the ring and near each of the C-NO2 bonds [77]. This pattern is modified somewhat when other substituents are introduced, but its basic features remain even under the influence of electron-donating groups, e.g. in 2,4-diamino-1,3,5-trinitrobenzene, 12, although the maxima are now less positive [76]. The nitroheterocycles, such as 1-methyl-1,4-dinitro-1,3,4-triazole, 13, have surface potential maxima near the C-NO2 bonds [77]. The only negative surface regions shown by any of these three molecules, 11 - 13, are associated with the nitro oxygens.

[Structures 11, 12 and 13]

The surface potential maxima near the C-NO2 bonds are a particularly interesting feature of this linkage. Buildups of positive electrostatic potential above C-NO2 bond regions have been observed in a variety of types of molecules [75-84], and it has been demonstrated that they can serve as channels for nucleophilic attack [78,84].

In a recent study of the computed electrostatic potentials and related quantities on the molecular surfaces of 13 trinitroaromatic derivatives [77], we noted two rough trends; both the local polarity $\Pi$ and the surface potential maximum above the ring, $V_{S,max}$, tend to increase as the impact sensitivity increases. Accordingly we used a statistical analysis program [85] to investigate whether an acceptable quantitative relationship could be developed. We found that the sensitivity h50 could indeed be represented in terms of $\Pi$ and $V_{S,max}$(ring); our best expression was [77],

$$h_{50} = \alpha\left[\Pi^2\, V_{S,max}(\mathrm{ring})\right]^{-1} + \beta\,\Pi^2 + \gamma \qquad (4)$$
In eq. (4), α > 0, β > 0 and γ > 0. The linear correlation coefficient is 0.989 and the standard deviation is 14 cm, for experimental impact sensitivities that range from 36 cm to 320 cm. In view of the uncertainties associated with the measured values, we were pleased to obtain as good a correlation as this.

Our success with the trinitroaromatics prompted us to try the same approach with five nitroheterocycles (2 nitroimidazoles and 3 nitrotriazoles) [77]. These molecules do not have surface potential maxima above the rings, so we used for each one the most positive of the maxima near its C-NO2 bonds, $V_{S,max}$(C-NO2). These tend to increase as the impact sensitivity increases, just as was observed for the $V_{S,max}$(ring) values of the trinitroaromatics. While $\Pi$ does not show a parallel trend for these nitroheterocycles, nevertheless a satisfactory representation of h50 was obtained involving both $V_{S,max}$(C-NO2) and $\Pi$ [77]:

$$h_{50} = \alpha\left[\Pi^2\, V_{S,max}(\mathrm{C\text{-}NO_2})\right]^{-1} + \beta \qquad (5)$$

In eq. (5), α > 0 and β > 0. The linear correlation coefficient is 0.986 and the standard deviation is 19 cm, for experimental sensitivities between 35 cm and 291 cm. There is a notable similarity between the first terms on the right sides of eqs. (4) and (5).
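Because a relationship of the form of eq. (4) is linear in its three coefficients, it can be fitted by ordinary least squares once the two regressors are built from the average deviation and the ring maximum. The sketch below is a hedged illustration only: it uses NumPy rather than the statistical package cited in the text, the function names are hypothetical, and the input values and coefficients are synthetic, since the fitted coefficients are not given in this chapter.

```python
import numpy as np


def design_matrix(pi, v_max):
    """Columns of the eq. (4) form: 1/(Pi^2 * V_S,max), Pi^2 and a constant."""
    pi = np.asarray(pi, dtype=float)
    v_max = np.asarray(v_max, dtype=float)
    return np.column_stack([1.0 / (pi**2 * v_max), pi**2, np.ones_like(pi)])


def fit_eq4(pi, v_max, h50):
    """Least-squares estimates of the three coefficients (alpha, beta, gamma)."""
    x = design_matrix(pi, v_max)
    # The columns differ by many orders of magnitude, so scale them to
    # unit norm before solving and undo the scaling afterwards.
    scale = np.linalg.norm(x, axis=0)
    coef_scaled, *_ = np.linalg.lstsq(x / scale, np.asarray(h50, dtype=float),
                                      rcond=None)
    return coef_scaled / scale


# Synthetic inputs (kcal/mole) and coefficients, for illustration only.
pi = np.array([17.0, 18.0, 19.0, 20.0, 21.0, 22.0])
v_max = np.array([20.0, 24.0, 27.0, 31.0, 36.0, 40.0])
assumed = np.array([2.0e6, 0.05, 10.0])          # alpha, beta, gamma
h50 = design_matrix(pi, v_max) @ assumed         # noise-free synthetic h50

fitted = fit_eq4(pi, v_max, h50)
print(fitted)
```

With real data the measured h50 values would replace the synthetic ones, and the scatter in the measurements would limit how well the three coefficients are determined, particularly for data sets as small as those discussed here.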
A possible explanation for the role of $V_{S,max}$(C-NO2) in eq. (5) is provided by our finding that it shows a fair correlation with the computed dissociation energy of the C-NO2 bond, for a total of eight such bonds in a group of five nitroheterocycles [86]. When we sought to generalize this observation to a series of eleven nitroalkanes, we did find a relationship between the calculated C-NO2 dissociation energy and $V_{S,max}$(C-NO2), which now also includes the molecular surface area [87]. (The fact that the latter was not needed for the nitroheterocycles may mean simply that those five molecules have similar areas.) Thus it may be that $V_{S,max}$(C-NO2) in eq. (5) is an indirect measure of the strengths of the C-NO2 bonds in the nitroheterocycles; this would be consistent with the widespread emphasis upon C-NO2 and N-NO2 bonds as trigger linkages, mentioned earlier. For the nitroaromatics, it may be that the potential maximum above the ring reflects all three C-NO2 bonds, due to their symmetrical distribution.

Understanding the function of $\Pi$ in eqs. (4) and (5) poses a greater challenge. Eq. (5) and the very rough correlation between $\Pi$ and sensitivity among the nitroaromatics seem to suggest that the latter increases with internal charge separation, which presumably counteracts the stabilizing effect of electronic delocalization in these unsaturated molecules. We have advanced this argument in the past [77,87]. For the nitroheterocycles, however, even though $\Pi$ does appear in eq. (5), its values do not correlate in any obvious manner with sensitivity [77]. The form of eq. (4) also raises a question: The contributions of the two variable terms on the right side are both positive and of the same order of magnitude [77]; yet $\Pi$ appears in the numerator of one and the denominator of the other. This ambiguity concerning the role of $\Pi$ led us to reexamine the relationship of impact sensitivity to the molecular surface electrostatic potential.

3.3. Impact Sensitivity and Surface Potential Imbalance

The negative portions of a molecular surface are generally the smaller part of the total area, but they are frequently relatively strong; thus the average negative surface potential, $\bar{V}_S^-$, is typically larger in magnitude than its positive counterpart, $\bar{V}_S^+$, and the negative variance $\sigma_-^2$ is greater than the positive, $\sigma_+^2$. This can be seen, for a representative group of molecules, in Table 1. The exceptions tend to be those having several strongly electron-attracting constituents, such as CF4 and p-dinitrobenzene. It is instructive to compare the last of these to nitrobenzene; the second nitro group actually decreases the magnitudes of both $\bar{V}_S^-$ and $\sigma_-^2$, because the polarizable electronic charge must now be shared between two electron-withdrawing substituents. We have encountered analogous situations on a number of previous occasions [76,81,88]. One consequence is that the value of $\Pi$ tends to level off as the number of strongly electron-attracting constituents increases [89]; thus, in going sequentially from benzene to tetranitrobenzene, the magnitudes of $\Pi$ are 4.9, 12.3, 16.5, 19.5 and 21.4 kcal/mole. It is actually larger for H2O, 21.6 kcal/mole, than for tetranitrobenzene!

Table 1. Some computed molecular surface quantities.(a)

Molecule          $\Pi$   $\bar{V}_S^+$   $\bar{V}_S^-$   $\sigma_+^2$   $\sigma_-^2$
cyclohexane        2.2      2.7     -1.6      2.5      0.7
C6H5CH3            4.6      4.2     -5.2      6.8     11.1
benzene            4.9      4.8     -5.0      7.1      9.2
(H5C2)2O           6.7      5.9     -9.4      8.0    129.8
(H3C)3COH          7.7      6.2    -12.3     31.1    182.7
CF4                8.3     11.5     -4.6     66.9      2.9
pyridine           8.5      6.8    -13.0     18.5    212.3
C6H5OH             8.6      8.7     -8.5     63.8     73.7
CH3COCH3           9.4      7.7    -18.8     15.9    159.8
CH3COOCH3         10.0      7.4    -16.7      9.7    129.2
C2H5OH            10.1      8.1    -13.7     45.1    182.4
(H3C)2NCHO        11.1      9.3    -17.0     18.6    158.8
pyrimidine        12.0      9.9    -17.4     24.4    163.0
C6H5NO2           12.3     10.4    -22.1     16.7    105.2
CH3CONH2          14.7     12.5    -20.0     67.9    139.9
p-C6H4(NO2)2      16.5     17.9    -17.2     29.8     62.5
H3CCN             17.1     15.5    -22.2     23.6    167.8
H2NCHO            18.1     17.1    -20.0     85.5    233.6
CH3NO2            19.9     19.4    -21.4     34.4     81.7
H2O               21.6     19.8    -24.3     85.7    161.8

(a) Data are taken from references 65 and 68. $\Pi$, $\bar{V}_S^+$ and $\bar{V}_S^-$ are in kcal/mole; $\sigma_+^2$ and $\sigma_-^2$ are in (kcal/mole)^2.

Energetic molecules generally have several strong electron attractors, such as nitro groups and aza nitrogens, that are competing for the polarizable charge. Accordingly they tend to have $\bar{V}_S^+ > |\bar{V}_S^-|$ and $\sigma_+^2 > \sigma_-^2$, as can be seen for the examples in Table 2. This is in marked contrast to the more typical situation, illustrated in Table 1. For the former, $\Pi$ levels off in the range 23 - 26 kcal/mole, and therefore cannot reflect the increasing sensitivity that accompanies a continuing introduction of electron-withdrawing constituents.

Table 2. Measured impact sensitivities and computed molecular surface quantities for some energetic molecules.(a)

Molecule                      h50    $\Pi$   $\bar{V}_S^+$   $\bar{V}_S^-$   $\sigma_+^2$   $\sigma_-^2$

nitroaromatics (trinitrobenzene ring with substituents X, Y, Z):
CHO, H, H                      36    19.1    23.8    -13.9    152.3    49.0
OH, H, H                       87    20.9    24.9    -17.0    174.3    87.5
H, H, H                       100    19.5    23.9    -15.3    109.0    55.3
CH3, H, H                     160    18.3    22.1    -15.2     95.5    51.7
NH2, H, H                     177    18.6    22.0    -16.1     97.9    65.5
NH2, NH2, H                   320    17.1    19.4    -16.8     66.2    75.2

nitroheterocycles:
[structure drawing](b)         68    20.7    27.4    -12.9    299.4    43.7
[structure drawing](b)        105    23.3    28.6    -17.9    312.1    70.6
[structure drawing](b)        291    20.1    23.4    -16.4    227.2    79.7

nitramines:
H2C(NH-NO2)2                   13    23.4    29.6    -16.8    319.4    60.6
[structure drawing](b)         26    23.2    27.8    -18.5    181.1    57.2
[structure drawing](b)         79    16.4    18.3    -14.4     67.3    47.7
[structure drawing](b)        114    22.6    24.2    -21.4    164.0    85.6
[structure drawing](b)        320    17.4    18.8    -16.3     51.2    57.6

(a) Impact sensitivities (in cm) are from reference 10; the computed data are from reference 89. $\Pi$, $\bar{V}_S^+$ and $\bar{V}_S^-$ are in kcal/mole; $\sigma_+^2$ and $\sigma_-^2$ are in (kcal/mole)^2.
(b) The molecule is identified in the original by a structure drawing, which is not reproduced here.

Our analysis led eventually to the conclusion that impact sensitivity can be related to this characteristic imbalance between the larger $\bar{V}_S^+$ and the smaller $|\bar{V}_S^-|$ [89]. There are of course a variety of possible measures of such imbalance, including the quantity ν, defined by eq. (6), which we have used effectively for this purpose in the past [64-66].

$$\nu = \frac{\sigma_+^2\,\sigma_-^2}{\left(\sigma_{tot}^2\right)^2} \qquad (6)$$

By definition, ν attains a maximum value of 0.250 when $\sigma_+^2 = \sigma_-^2$ but approaches zero when $\sigma_+^2 \gg \sigma_-^2$ or $\sigma_-^2 \gg \sigma_+^2$. Among other possibilities are ratios and differences of $\bar{V}_S^+$ and $|\bar{V}_S^-|$ or $\sigma_+^2$ and $\sigma_-^2$. We investigated a number of options [89] and eventually settled upon eqs. (7) - (9) to represent the impact sensitivities of the original 13 nitroaromatics and 5 nitroheterocycles, as well as a group of 8 nitramines. The good correlation coefficients and relatively low standard deviations are gratifying, given the uncertainties in the data bases.
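The balance measure of eq. (6) is easy to evaluate from the tabulated variances. The sketch below is a minimal illustration; the function name is hypothetical, and the three pairs of positive and negative variances are read from the benzene and CF4 rows of Table 1 and the H, H, H (1,3,5-trinitrobenzene) row of Table 2.

```python
def imbalance_nu(var_pos, var_neg):
    """Eq. (6): product of the two variances over the squared total variance."""
    total = var_pos + var_neg
    return var_pos * var_neg / total**2


# (positive variance, negative variance) in (kcal/mole)^2, from Tables 1 and 2.
examples = {
    "benzene (Table 1)": (7.1, 9.2),
    "CF4 (Table 1)": (66.9, 2.9),
    "1,3,5-trinitrobenzene (Table 2)": (109.0, 55.3),
}

nu_values = {name: imbalance_nu(*pair) for name, pair in examples.items()}
for name, nu in nu_values.items():
    print(f"{name}: nu = {nu:.3f}")
```

Benzene, with nearly equal variances, comes out close to the limiting value of 0.250, while CF4, whose positive variance is far larger than its negative one, gives a value near zero.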
nitroaromatics:

h50 = [right-hand side not legible in the source]    (7)

α, β > 0; γ < 0. Correlation coefficient = 0.990; standard deviation = 14 cm.

nitroheterocycles:

h50 = [right-hand side not legible in the source]    (8)

α, β > 0; γ < 0. Correlation coefficient = 0.998; standard deviation = 8 cm.

nitramines:

h50 = [right-hand side not legible in the source]    (9)

α > 0; β, γ < 0. Correlation coefficient = 0.997; standard deviation = 9 cm.

There is a unifying element in these equations, in that they all focus upon the relative magnitudes of $\bar{V}_S^+$ and $|\bar{V}_S^-|$ or $\sigma_+^2$ and $\sigma_-^2$. The additional dependence upon $V_{S,max}$ in eq. (7) may be due to the symmetrical trinitrobenzene framework that is common to all 13 nitroaromatics. In eqs. (8) and (9), the contribution of the first variable term is larger than that of the second by an order of magnitude; the latter can be viewed as a correction term.

A general conclusion that seems to be indicated by these results is that the anomalous imbalance between the strengths of the positive and negative surface potentials is at least symptomatic of the degree of instability within these three classes of molecules. It should be noted that the very strong positive potentials produced by the electron-withdrawing components do not lead to as high levels of internal charge separation (and values of $\Pi$) as might be anticipated, because the negative potentials are uncharacteristically weak.

4. SUMMARY

We have presented an overview of various attempts to relate the impact and shock sensitivities of energetic materials to their molecular structures. The objectives of such efforts are to better understand the chemical and structural determinants of these sensitivities, and to develop a predictive capability to facilitate the evaluation of new and proposed energetic compounds.
Our particular emphasis in this discussion has been upon the relationship of impact sensitivities to the electrostatic potential patterns on the molecular surfaces. The current status of our analyses is represented by eqs. (7) - (9). While the success of these expressions is pleasing, we certainly do not claim that they are in final form. They reflect small data bases and measurements with a relatively high level of uncertainty. As more compounds are included, it may well be that the specific formulations given in eqs. (7) - (9) will be modified; i.e. other functions of $\bar{V}_S^+$ and $\bar{V}_S^-$ or $\sigma_+^2$ and $\sigma_-^2$ may turn out to be more effective. What is important at present, however, is the unifying concept that we have found to apply to all three of these classes of compounds, namely that their impact sensitivities can be related to the degree of imbalance between their typically stronger positive surface electrostatic potentials and weaker negative ones.

ACKNOWLEDGMENT

We greatly appreciate the financial support of the Office of Naval Research, through contract N00014-97-1-0066 and Program Officer Dr. Richard S. Miller.

REFERENCES

1. C. M. Tarver, J. Chem. Eng. Data 24 (1979) 136.
2. J. R. Holden, Z. Du and H. L. Ammon, J. Comp. Chem. 14 (1993) 422.
3. J. R. Stine, in Structure and Properties of Energetic Materials, D. H. Liebenberg, R. W. Armstrong and J. J. Gilman, eds., Materials Research Society, Pittsburgh, 1993, p. 3.
4. M. D. Allendorf and C. F. Melius, J. Phys. Chem. 97 (1993) 720.
5. D. Habibollahzadeh, M. E. Grice, M. C. Concha, J. S. Murray and P. Politzer, J. Comp. Chem. 16 (1995) 654.
6. P. Politzer, J. S. Murray, M. E. Grice, M. DeSalvo and E. Miller, Mol. Phys. in press.
7. M. J. Kamlet and S. J. Jacobs, J. Chem. Phys. 48 (1968) 23.
8. S. Iyer and N. Slagg, in Structure and Reactivity (Molecular Structure and Energetics), J. F. Liebman and A. Greenberg, eds., VCH Publishers, New York, 1988, ch. 7.
9. P. Politzer, J. S. Murray, M. E. Grice and P. Sjoberg, in Chemistry of Energetic Materials, G. A. Olah and D. R. Squire, eds., Academic Press, New York, 1991, ch. 4.
10. C. B. Storm, J. R. Stine and J. F. Kramer, in Chemistry and Physics of Energetic Materials, S. N. Bulusu, ed., Kluwer, Dordrecht, The Netherlands, 1990, ch. 27.
11. J. E. Field, Acc. Chem. Res. 25 (1992) 489.
12. C. L. Mader, in Organic Energetic Compounds, P. L. Marinkas, ed., Nova, Commack, NY, 1996, ch. 3.
13. C. M. Tarver, S. K. Chidester and A. L. Nichols, III, J. Phys. Chem. 100 (1996) 5794.
14. M. J. Ehrlich, J. W. Wagner, J. Friedman and H. Egghart, in Structure and Properties of Energetic Materials, D. H. Liebenberg, R. W. Armstrong and J. J. Gilman, eds., Materials Research Society, Pittsburgh, 1993, 305.
15. M. J. Kamlet, Proc. Sixth Symposium (International) on Detonation, Office of Naval Research, Report ACR 221, 1976.
16. M. J. Kamlet and H. G. Adolph, Propell. Expl. 4 (1979) 30.
17. J. S. Murray and P. Politzer, in Chemistry and Physics of Energetic Materials, S. N. Bulusu, ed., Kluwer, Dordrecht, The Netherlands, 1990, ch. 8.
18. M. J. Kamlet and H. G. Adolph, Proc. Seventh Symposium (International) on Detonations, Naval Surface Warfare Center, Silver Springs, MD, Report NSWCMP-82-334, 1981, p. 84.
19. R. Sundararajan and S. R. Jain, Ind. J. Tech. 21 (1983) 474.
20. G. T. Afanas'ev, T. S. Pivina and D. V. Sukhachev, Propell. Expl. Pyrotech. 18 (1993) 309.
21. J. Sharma and F. J. Owens, Chem. Phys. Lett. 61 (1979) 280.
22. S. Bulusu and T. Axenrod, Org. Mass Spectr. 14 (1979) 585.
23. M. Farber and R. D. Srivastava, Chem. Phys. Lett. 64 (1979) 307.
24. F. I. Dubovitskii and B. L. Korsunskii, Russ. Chem. Revs. 50 (1981) 958.
25. S. Zeman, M. Dimun and S. Truchlik, Thermochim. Acta 78 (1984) 181.
26. A. C. Gonzalez, C. W. Larson, D. F. McMillen and D. M. Golden, J. Phys. Chem. 89 (1985) 4809.
27. F. J. Owens and J. Sharma, J. Appl. Phys. 51 (1985) 1494.
28. W. Tsang, D. Robaugh and W. G. Mallard, J. Phys. Chem. 90 (1986) 5968.
29. C. Capellos, P. Papagiannakopoulous and Y.-L. Liang, Chem. Phys. Lett. 164 (1989) 533.
30. T. D. Sewell and D. L. Thompson, J. Phys. Chem. 95 (1991) 6228.
31. C. A. Wight and T. R. Botcher, J. Am. Chem. Soc. 114 (1992) 8303.
32. G. F. Adams and R. W. Shaw Jr., in Annual Reviews of Physical Chemistry, Vol. 43, H. L. Strauss, ed., Annual Reviews Inc., Palo Alto, CA, 1992, p. 311.
33. T. B. Brill and K. J. James, J. Phys. Chem. 97 (1993) 8752, 8759.
34. M. D. Pace, J. Phys. Chem. 98 (1994) 6251.
35. J. C. Oxley, J. L. Smith, H. Ye, R. L. McKenney and P. R. Bolduc, J. Phys. Chem. 99 (1995) 9593.
36. L. E. Fried and A. J. Ruggiero, J. Phys. Chem. 98 (1994) 9786.
37. X. Hong, S. Chen and D. D. Dlott, J. Phys. Chem. 99 (1995) 9102.
38. F. J. Owens, J. Mol. Struct. (Theochem) 121 (1985) 213.
39. P. Politzer, J. S. Murray, P. Lane, P. Sjoberg and H. G. Adolph, Chem. Phys. Lett. 181 (1991) 78.
40. F. J. Owens, K. Jayasuriya, L. Abrahmsen and P. Politzer, Chem. Phys. Lett. 116 (1985) 434.
41. J. S. Murray, P. Lane, P. Politzer and P. R. Bolduc, Chem. Phys. Lett. 168 (1990) 135.
42. A. Delpuech and J. Cherville, Propell. Expl. 3 (1978) 169; 4 (1979) 61, 121.
43. Y. Kohno, K. Maekawa, T. Tsuchioka, T. Hashizume and A. Imamura, Combust. Flame 96 (1994) 343.
44. Y. Kohno, K. Ueda and A. Imamura, J. Phys. Chem. 100 (1996) 4701.
45. K. Suryanarayanan and C. Capellos, Int. J. Chem. Kinet. 6 (1974) 89.
46. J. R. Cox and I. H. Hillier, Chem. Phys. 124 (1988) 39.
47. A. T. Nielsen, in The Chemistry of the Nitro and Nitroso Groups, Part 1, H. Feuer, ed., Wiley/Interscience, New York, 1969, ch. 7.
48. P. Politzer, J. M. Seminario and P. R. Bolduc, Chem. Phys. Lett. 158 (1989) 463.
49. R. Engelke, W. L. Earl and C. M. Rohlfing, J. Chem. Phys. 84 (1986) 142.
50. R. Engelke, D. Schiferl, C. B. Storm and W. L. Earl, J. Phys. Chem. 92 (1988) 6815.
51. P. Politzer, J. M. Seminario and A. G. Zacarias, Mol. Phys. in press.
52. J. Sharma, J. C. Hoffsomer, D. J. Glover, C. S. Coffey, F. Santiago, A. Stolovy and S. Yasuda, in Shock Waves in Condensed Matter, J. R. Assay, R. A. Graham and G. K. Straub, eds., Elsevier, Amsterdam, 1983, ch. 12.
53. J. Sharma, J. W. Forbes, C. S. Coffey and T. P. Liddiard, J. Phys. Chem. 91 (1987) 5139.
54. J. S. Murray, P. Lane, P. Politzer, P. R. Bolduc and R. L. McKenney, Jr., J. Mol. Struct. (Theochem) 209 (1990) 349.
55. C. B. Storm and J. R. Travis, in Structure and Properties of Energetic Materials, D. H. Liebenberg, R. W. Armstrong and J. J. Gilman, eds., Materials Research Society, Pittsburgh, 1993, 25.
56. R. N. Rogers, Anal. Chem. 39 (1967) 731.
57. J. C. Dacons, H. G. Adolph and M. J. Kamlet, J. Phys. Chem. 74 (1970) 3035.
58. F. R. Benson, The High Nitrogen Compounds, Wiley-Interscience, New York, 1984.
59. C. B. Storm, R. R. Ryan, J. P. Ritchie, J. H. Hall and S. M. Bachrach, J. Phys. Chem. 93 (1989) 1000.
60. P. Politzer, M. E. Grice and J. M. Seminario, Int. J. Quant. Chem. 61 (1997) 389.
61. R. Pfleger, E. Garthe and K. Raner, Chem. Ber. 96 (1963) 1827.
62. R. A. Carboni, J. C. Kauer, J. E. Castle and H. E. Simmons, J. Am. Chem. Soc. 89 (1967) 2618.
63. M. E. Grice and P. Politzer, J. Mol. Struct. (Theochem) 358 (1995) 63.
64. J. S. Murray and P. Politzer, in Quantitative Treatments of Solute/Solvent Interactions, J. S. Murray and P. Politzer, eds., Elsevier, Amsterdam, 1994, ch. 8.
65. J. S. Murray, T. Brinck, P. Lane, K. Paulsen and P. Politzer, J. Mol. Struct. (Theochem) 307 (1994) 55.
66. P. Politzer, J. S. Murray, T. Brinck and P. Lane, in Immunoanalysis of Agrochemicals; Emerging Technologies, J. O. Nelson, A. E. Karu and R. B. Wong, eds., ACS, Washington, 1995, ch. 8.
67. R. F. W. Bader, M. T. Carroll, J. R. Cheeseman and C. Chang, J. Am. Chem. Soc. 109 (1987) 7968.
68. T. Brinck, J. S. Murray and P. Politzer, Mol. Phys. 76 (1992) 609.
69. J. S. Murray, P. Lane, T. Brinck, K. Paulsen, M. E. Grice and P. Politzer, J. Phys. Chem. 97 (1993) 9369.
70. P. Politzer, P. Lane, J. S. Murray and T. Brinck, J. Phys. Chem. 96 (1992) 7938.
71. J. S. Murray, S. G. Gagarin and P. Politzer, J. Phys. Chem. 99 (1995) 12081.
72. K. D. Sen and P. Politzer, J. Chem. Phys. 90 (1989) 4370.
73. P. Politzer and J. S. Murray, in Theoretical Biochemistry and Molecular Biophysics: A Comprehensive Survey, Vol. 2, D. L. Beveridge and R. Lavery, eds., Adenine Press, Schenectady, NY, 1991, ch. 13.
74. P. Politzer and J. S. Murray, in Reviews in Computational Chemistry, Vol. 2, K. B. Lipkowitz and D. B. Boyd, eds., VCH Publishers, New York, 1991, ch. 7.
75. P. Politzer, L. Abrahmsen and P. Sjoberg, J. Am. Chem. Soc. 106 (1984) 855.
76. P. Politzer and R. Bar-Adon, J. Am. Chem. Soc. 109 (1987) 3529.
77. J. S. Murray, P. Lane and P. Politzer, Mol. Phys. 85 (1995) 1.
78. P. Politzer, P. R. Laurence, L. Abrahmsen, B. A. Zilles and P. Sjoberg, Chem. Phys. Lett. 111 (1984) 75.
79. P. Politzer, P. Lane, K. Jayasuriya and L. N. Domelsmith, J. Am. Chem. Soc. 109 (1987) 1899.
80. P. Politzer and N. Sukumar, J. Mol. Struct. (Theochem) 179 (1988) 439.
81. J. S. Murray and P. Politzer, J. Mol. Struct. (Theochem) 163 (1988) 111; 180 (1988) 161.
82. J. S. Murray and P. Politzer, J. Mol. Struct. (Theochem) 180 (1988) 161.
83. J. S. Murray, J. M. Seminario and P. Politzer, J. Mol. Struct. (Theochem) 187 (1989) 95.
84. J. S. Murray, P. Lane and P. Politzer, J. Mol. Struct. (Theochem) 209 (1990) 163.
85. SAS, SAS Institute Inc., Cary, NC 27511.
86. P. Politzer and J. S. Murray, Mol. Phys. 86 (1995) 251.
87. P. Politzer and J. S. Murray, J. Mol. Struct. 376 (1996) 419.
88. P. Politzer, G. P. Kirschenheuter and J. Alster, J. Am. Chem. Soc. 109 (1987) 1033.
89. J. S. Murray, P. Lane and P. Politzer, Mol. Phys., submitted.
Z.B. Maksić and W.J. Orville-Thomas (Editors), Pauling's Legacy: Modern Modelling of the Chemical Bond. Theoretical and Computational Chemistry, Vol. 6. © 1999 Elsevier Science B.V. All rights reserved.
VALENCE BOND THEORY
A RE-EXAMINATION OF CONCEPTS AND METHODOLOGY

Roy McWeeny
Dipartimento di Chimica e Chimica Industriale, Università di Pisa, 56100 Pisa (Italy)

1. INTRODUCTION

Nowadays, when immense computing power is available to everyone, there is a danger that the wish to exploit this power may prove stronger than the urge to look for new concepts and to formulate new theories. The situation was totally different when Pauling was writing his famous book on The Nature of the Chemical Bond, in which he systematically developed the Valence Bond (VB) theory and applied it in almost all areas of chemistry. He was not distracted by computers and like everyone else was compelled to look for simple mathematical models that could provide some kind of 'understanding' even in the absence of precise numerical prediction. He was brilliantly successful: most of the concepts he formulated, many of them pictorial in character, had an immediate appeal to chemists, whose thinking they have continued to guide even up to the present day.

This Chapter attempts to show how modern valence bond theory has evolved during the last sixty years, from the primitive models employed by Pauling and his contemporaries to those of a more sophisticated and ab initio nature; and to show how well Pauling's ideas have stood the test of time. Indeed, as more and more elaborate calculations are performed, more and more of those ideas seem to find a rigorous non-empirical justification.

2. THE ELECTRON-PAIR BOND - SOME PRELIMINARIES

To introduce the concepts, we must of course start from the classic 1927 paper of Heitler and London [1], referred to by Pauling himself [2] as the "greatest single contribution to the clarification of the chemist's conception of valence ... since G.N. Lewis's suggestion that the chemical bond between two atoms consists of a pair of electrons held jointly by the two".
The 'electron-pair' bond in the hydrogen molecule was described using a wavefunction of the form

\[ \Phi(A\text{-}B) = (\chi_A\chi_B + \chi_B\chi_A)(\alpha\beta - \beta\alpha), \qquad (1) \]

where $\chi_A$, $\chi_B$ denote the 1s orbitals of the hydrogen atoms A and B, while $\alpha$, $\beta$ are the spin functions for 'up-spin' and 'down-spin' states, respectively. For simplicity electronic variables (space and spin) have not been shown: they are taken to be in natural order $\mathbf{r}_1, \mathbf{r}_2$ or, for spins, $s_1, s_2$ in all products. Trivial normalizing factors will also frequently be omitted. The function (1) conforms to the Pauli principle, that the wavefunction must be antisymmetric with respect to an exchange of electrons 1 and 2, the spatial factor being unaffected when the electrons are switched between A and B, while the spin factor changes sign and describes 'anti-parallel coupling' or 'pairing' of the spin angular momenta.

The electron-pair bond in this way came to be associated with pairing of spins - a concept that was useful in the days of simple pictorial models, but also became the source of much confusion. In fact, at the 'chemical' level of accuracy, the physical (i.e. magnetic) interaction of electron spins is quite negligible and it was soon realized [3] that the 'pairing' in a wavefunction such as (1) is simply an indicator of the symmetry of the spatial factor - the antisymmetry of the spin function requiring symmetry of the spatial function which, with a Hamiltonian in which the minute spin interactions are neglected, completely determines the electronic energy. Indeed, there is a second Pauli-compatible wavefunction in which the + and - signs in (1) have been interchanged and the resultant spin factor then indicates parallel coupling of spin angular momenta to total spin S = 1: the associated antisymmetric spatial factor then corresponds to an excited triplet state.

Starting with just two atomic orbitals, $\chi_A$, $\chi_B$, it is evidently possible to construct various other types of wavefunction. Thus

\[ \Phi(A^-B^+) = \chi_A\chi_A(\alpha\beta - \beta\alpha), \qquad \Phi(A^+B^-) = \chi_B\chi_B(\alpha\beta - \beta\alpha), \qquad (2) \]

represent 'ionic' situations in which both electrons have been assigned to the same atomic orbital, leaving the other one empty, with a corresponding shift of charge. In a standard variational calculation all three functions in (1) and (2) may be mixed, to yield an improved approximate wavefunction of the form

\[ \Psi_{VB} = \Phi(A\text{-}B) + a\,\Phi(A^-B^+) + b\,\Phi(A^+B^-), \qquad (3) \]

where a, b are numerical parameters, which are to be optimized so as to minimize the total electronic energy

\[ E = \frac{\langle\Psi|\hat{H}|\Psi\rangle}{\langle\Psi|\Psi\rangle}, \qquad (4) \]

$\hat{H}$ being the Hamiltonian operator. Throughout this Chapter, we use the usual nonrelativistic Born-Oppenheimer Hamiltonian in which no spin operators occur and the nuclear positions are considered fixed.

The variational energies calculated using functions (1) and (3) are shown in Fig.1 (a,b) as functions of the internuclear distance R: the Heitler-London function (1) correctly predicts a stable molecule, with roughly the observed bond length and a dissociation limit corresponding to two isolated hydrogen atoms. It is therefore basically satisfactory: it is considerably improved by admitting the two 'ionic' functions (2) but the common value of the parameters (a, b) is only about 0.12 at equilibrium geometry, falling to zero at large R. The function (1), corresponding to a single 'covalent structure', is thus adequate and the so-called 'ionic structures' (2) play a minor role: the bonding in H$_2$ is essentially covalent. The association of covalent bonds with pairing of the spins of electrons on different atoms is the fundamental premise on which the whole of valence bond (VB) theory is based. The alternative choice of signs in (1), corresponding to parallel-coupled spins, yields a triplet excited state whose energy curve indicates repulsion at all distances.
Figure 1. H$_2$ energy curves, $E/E_h$ plotted against $R/a_0$: (a) Heitler-London, (b) Molecular orbital, (c) Coulson-Fischer, (d) Experimental.
It will be recalled that the approach of molecular orbital (MO) theory starts, on the other hand, from an 'independent-particle model' (IPM) in which both electrons occupy the same 'bonding MO', $\phi_1 = \chi_A + \chi_B$, similar to the one used [4] for the hydrogen molecule ion, H$_2^+$. The bonding MO is in fact the approximate wavefunction for a single electron in the field of the two nuclei; and allocating two electrons to this same MO, with opposite spins, yields the 2-electron wavefunction

\[ \Psi_{MO} = \phi_1\phi_1(\alpha\beta - \beta\alpha). \qquad (5) \]

The spins must be paired in this case, to satisfy the Pauli principle, the spatial factor $\phi_1(\mathbf{r}_1)\phi_1(\mathbf{r}_2)$ being already symmetric. In the VB function, the electrons are allowed to 'change places', i.e. to be associated with either of the two nuclei A and B, by using the symmetrized product $\chi_A(\mathbf{r}_1)\chi_B(\mathbf{r}_2) + \chi_B(\mathbf{r}_1)\chi_A(\mathbf{r}_2)$; but in the MO approach each electron is 'automatically' shared between the nuclei - because its orbital contains terms coming from both centres. In MO theory, then, bonding is not associated with exchange of electrons between differently localized AOs, but rather with the fact that the MOs themselves extend over more than one centre - the MOs are essentially delocalized over the centres and the bonding is associated with the presence of each electron in the region where the AO of one atom overlaps with that of another.

It is clear that the MO function (5) may be expanded in terms of the AOs $\chi_A$, $\chi_B$, to give

\[ \Psi_{MO} = \Phi(A\text{-}B) + \Phi(A^-B^+) + \Phi(A^+B^-), \qquad (6) \]

a mixture of VB structures in which covalent and ionic terms all have the same weight. It is often said that the Heitler-London function (1) fails by leaving out the ionic structures and that the MO function (5) fails by giving them too much importance. The failure of the MO function is the more dramatic: it yields an energy curve (Fig.1) which behaves quite incorrectly away from equilibrium geometry and therefore offers no prospect of estimating a dissociation energy as the difference $E(\infty) - E(R_e)$. This failure of the MO approximation is quite general: without drastic correction it is incapable of describing systems in 'bond-breaking' geometries.
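The expansion leading to (6) takes a single line. As a worked check, with normalizing factors omitted as elsewhere in this section and the bonding MO written as $\phi_1 = \chi_A + \chi_B$ (the subscript 1 is a labelling adopted here for illustration):

\[ \phi_1(\mathbf{r}_1)\phi_1(\mathbf{r}_2) = [\chi_A(\mathbf{r}_1) + \chi_B(\mathbf{r}_1)][\chi_A(\mathbf{r}_2) + \chi_B(\mathbf{r}_2)] = (\chi_A\chi_B + \chi_B\chi_A) + \chi_A\chi_A + \chi_B\chi_B . \]

Multiplying by the common spin factor $(\alpha\beta - \beta\alpha)$ turns the three groups into $\Phi(A\text{-}B)$, $\Phi(A^-B^+)$ and $\Phi(A^+B^-)$ of (1) and (2), so that (6) is simply the special case $a = b = 1$ of the VB function (3).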
The Heitler-London function was improved by considering alternative 'electron configurations', with both electrons on one centre or the other, and then using the two-parameter function (3). In just the same way, the MO function (5) can be improved by adding configurations in which one or both electrons are promoted to the antibonding MO, $\phi_2 = \chi_A - \chi_B$, which has a nodal plane ($\phi_2 = 0$) midway between the nuclei. The (symmetric) spatial functions for such configurations, which are loosely described as 'singly-excited' and 'doubly-excited', are respectively

\[ \phi_1\phi_2 + \phi_2\phi_1, \qquad \phi_2\phi_2 . \]

By attaching the (antisymmetric) spin factor, for paired spins, we obtain 'configurational functions' (CFs) of singlet type; and these may be mixed with the 'reference function' (5), used as a first approximation to the ground state, to yield a variational function

\[ \Psi_{CI} = \Psi_{MO} + A\,\Phi_{11}^{12} + B\,\Phi_{11}^{22}, \qquad (7) \]

where subscripts and superscripts indicate the occupied MOs before and after electron promotion. When the parameters in the variation functions (3) and (7) are optimized, by minimizing the variational energy approximation (4) for every internuclear distance, a greatly improved energy curve is obtained (Fig.1 (c)), either expansion giving exactly the same results. This represents the best possible approximation that can be obtained from the basis provided by the orbitals $\chi_A$, $\chi_B$. In more general terminology, the orbitals used may be the first 2 members of a basis set, $\{\chi_1, \chi_2, \ldots, \chi_i, \ldots, \chi_m\}$, comprising m 'basis functions' (not necessarily AOs), and the approximation obtained represents the 'basis set limit'.

It appears that alternative expansions of the wavefunction, assigning the electrons to either AOs or MOs, can be mathematically equivalent - provided all possible configurations are admitted and the CFs used satisfy appropriate symmetry requirements (e.g. coupling of spins to desired spin multiplicity, 2S + 1). This conclusion is in fact correct and very general: the approach being followed, originally due mainly to Slater, is nowadays known as 'configuration interaction' (CI) and provides the most widely used tool for electronic structure calculations. The choice of orbitals, used in defining the electron configurations, is immaterial provided they are linearly-independent combinations of the given basis functions - for example, MOs built up in LCAO approximation.

The above considerations suggest a third type of expansion, proposed by Coulson and Fischer [5] and later by Mueller and Eyring [6], in which the MOs are replaced by 'semi-localized' orbitals of the form

\[ \phi_A = \chi_A + \lambda\chi_B, \qquad \phi_B = \chi_B + \mu\chi_A . \qquad (8) \]

In the case of H$_2$, symmetry of the molecule suggests that $\mu = \lambda$; and for a small positive value of $\lambda$ the two orbitals will each show a distortion towards the other centre, becoming 'egg-shaped'. A generalization of the covalent structure function (1) is then

\[ \Psi_{CF} = (\phi_A\phi_B + \phi_B\phi_A)(\alpha\beta - \beta\alpha). \qquad (9) \]
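The link between (9) and the three-structure function (3) can be made explicit by a short expansion. The following worked step assumes $\mu = \lambda$ in (8) and omits normalizing factors, exactly as in (1) and (2):

\[ \phi_A\phi_B + \phi_B\phi_A = (1 + \lambda^2)(\chi_A\chi_B + \chi_B\chi_A) + 2\lambda(\chi_A\chi_A + \chi_B\chi_B), \]

so that, after attaching the spin factor,

\[ \Psi_{CF} = (1 + \lambda^2)\left[\Phi(A\text{-}B) + \frac{2\lambda}{1 + \lambda^2}\bigl(\Phi(A^-B^+) + \Phi(A^+B^-)\bigr)\right], \qquad a = b = \frac{2\lambda}{1 + \lambda^2}. \]

Two limits serve as checks: $\lambda = 0$ gives $a = 0$, the pure Heitler-London function (1), while $\lambda = 1$ gives $a = 1$, the MO function (6). With the structures taken exactly as written in (1) and (2), a coefficient of about 0.12 would correspond to $\lambda \approx 0.06$; that figure depends on this normalization convention and is quoted only as an illustration.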
The remarkable property of this function is that, when optimized by varying $\lambda$ at each internuclear distance, it exactly reproduces curve (c) in Fig.1 - even though there is now only one term in the CI expansion! In other words the bonding in the hydrogen molecule is purely 'covalent', with no participation of 'ionic structures'. This result should warn us against attaching too much 'objective' significance to the covalent and ionic structures used in the VB interpretation of molecular structures: the 'structures' are only terms in an expansion of the wavefunction and the relative importance of different terms depends entirely on the definition of the orbitals from which they are constructed. By expanding (9) in terms of the AOs $\chi_A$, $\chi_B$ it is a simple matter to derive the form (3) and to obtain an expression for the coefficient a (= b) in terms of the parameter $\lambda$ (= $\mu$) in (8): the ionic structures, when working in terms of conventional 1-centre AOs, may thus be implicitly included if we distort the orbitals in order to give them some 2-centre character.

Before turning to many-electron molecules, it is useful to ask: Where does the energy of the chemical bond come from? In VB theory it appears to be connected with 'exchange' of electrons between different atoms; but in MO theory it is associated with 'delocalization' of the MOs. In fact, the Hellmann-Feynman theorem (see, for example, Ch.5 of Ref.[7]) shows that the forces which hold the nuclei together in a molecule (defined in terms of the derivatives of the total electronic energy with respect to nuclear displacement) can be calculated by classical electrostatics, provided the electron distribution is represented as an 'electron density' $P(\mathbf{r})$ (number of electrons per unit volume at point $\mathbf{r}$) derived from the Schrödinger wavefunction $\Psi$. This density is defined (using x to stand for both space and spin variables $\mathbf{r}$, s, respectively) by

\[ P(\mathbf{r}) = N\int \Psi(x, x_2, \ldots, x_N)\,\Psi^*(x, x_2, \ldots, x_N)\, ds\, dx_2 \ldots dx_N \qquad (10) \]

and, more correctly, represents the probability/unit volume of finding an electron (no matter which) at point $\mathbf{r}$. In this definition, the wavefunction (which may be exact) is assumed antisymmetric and the spin variables are eliminated in the integrations. The factor N arises because, whichever electron is at point x, integration over the other N - 1 variables must give an identical result ($\Psi\Psi^*$ being symmetric).

The significance for chemistry of the Hellmann-Feynman theorem is obvious: the nuclei in a molecule may be regarded as held together, against the repulsion from their positive charges, by the attraction they feel towards any concentration of electron density between them. This interpretation provides a basis for most qualitative discussions of chemical bonding (see, for instance, Ref.[8]). An immediate application is provided by a comparison of the wavefunctions used above for the hydrogen molecule: all are constructed using the first two functions of a basis set $\chi_1, \chi_2, \ldots, \chi_m$ and a little consideration shows that all such approximations lead to an electron density of the form

\[ P(\mathbf{r}) = \sum_{r,s} P_{rs}\,\chi_r(\mathbf{r})\chi_s^*(\mathbf{r}) \qquad (11) \]

- a 'bilinear form' in which only the numerical coefficients, $P_{rs}$, differ between one approximation and another. With just two basis functions, the hydrogen 1s orbitals on the two centres, the electron density takes the form (all quantities being real)

\[ P(\mathbf{r}) = P_{AA}\chi_A(\mathbf{r})^2 + P_{BB}\chi_B(\mathbf{r})^2 + 2P_{AB}\chi_A(\mathbf{r})\chi_B(\mathbf{r}), \qquad (12) \]

in which the first two terms are 1s-type densities on centres A and B, with numerical 'weight factors', while the third term is an 'overlap density' concentrated mainly in the bond region and appearing with a weight factor $2P_{AB}$. The electron density is thus not just the sum of two hydrogen-like densities: the density in the region between the hydrogens may be considerably enhanced whenever $2P_{AB}$ is large and positive. The importance of the overlap density, which we associate with bonding, is thus proportional to the 'bond order' $P_{AB}$ - a term used extensively in early applications of MO theory but equally applicable in a wider context. Indeed, the matrix P, which collects all the coefficients $P_{rs}$ in (11), is often called the 'charge and bond order matrix'. The off-diagonal elements are (at least formally) the bond orders for pairs of mutually overlapping orbitals; the diagonal elements may be called 'charges' because they measure the amounts of electronic charge associated with the corresponding orbitals.

For a general many-electron molecule, equation (11) may be re-written in terms of normalized orbital and overlap densities

\[ d_r(\mathbf{r}) = \chi_r(\mathbf{r})\chi_r(\mathbf{r}), \qquad d_{rs}(\mathbf{r}) = S_{rs}^{-1}\chi_r(\mathbf{r})\chi_s(\mathbf{r}) \qquad (13) \]

($S_{rs}$ being the overlap integral for $\chi_r$, $\chi_s$) and then takes the form

\[ P(\mathbf{r}) = \sum_r q_r d_r(\mathbf{r}) + \sum_{r<s} q_{rs} d_{rs}(\mathbf{r}), \qquad (14) \]

where (in the usual case of real orbitals and coefficients)

\[ q_r = P_{rr}, \qquad q_{rs} = 2S_{rs}P_{rs}. \qquad (15) \]

Since integration over all space of the electron density $P(\mathbf{r})$ must yield the total number of electrons, it follows that

\[ \int P(\mathbf{r})\,d\mathbf{r} = \sum_r q_r + \sum_{r<s} q_{rs} = N \qquad (16) \]

and this gives a convenient means of visualizing how the electron density is spread out over the orbital and overlap regions [9]. In fact, the q's are nowadays usually referred to as 'electron populations' and this analysis of the density, further elaborated by Mulliken [10], is called 'electron population analysis'. When a population analysis is performed for the various wavefunctions used above, for the hydrogen molecule, it is found that the overlap populations are always positive (indicating a flow of electron density into the overlap region between the nuclei) but that the predicted enhancement of the density in the bond region depends on which function is used: for example, the MO and VB functions give bond populations

\[ q_{ab}^{MO} = \frac{2S_{ab}}{1 + S_{ab}}, \qquad q_{ab}^{VB} = \frac{2S_{ab}^2}{1 + S_{ab}^2}. \qquad (17) \]
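The populations in (15)-(17) are easy to tabulate numerically. The sketch below is an illustration only: it assumes hydrogen 1s orbitals with unit exponent, for which the overlap integral is $S = e^{-R}(1 + R + R^2/3)$ with R in bohr, and it evaluates the populations for the MO function (5), the Heitler-London function (1) and its triplet partner (signs in (1) interchanged). The function names and the choice R = 1.4 bohr are assumptions made for the example, not part of the original treatment.

```python
from math import exp, isclose


def overlap_1s(R):
    """Overlap of two hydrogen 1s orbitals (unit exponent) at separation R in bohr."""
    return exp(-R) * (1.0 + R + R * R / 3.0)


def populations(S, model):
    """Return (q_a, q_b, q_ab) for a two-orbital, two-electron H2 function.

    model: 'MO' for function (5), 'VB' for the Heitler-London singlet (1),
           'triplet' for (1) with the + and - signs interchanged.
    """
    if model == "MO":
        p_aa, p_ab = 1.0 / (1.0 + S), 1.0 / (1.0 + S)
    elif model == "VB":
        p_aa, p_ab = 1.0 / (1.0 + S * S), S / (1.0 + S * S)
    elif model == "triplet":
        p_aa, p_ab = 1.0 / (1.0 - S * S), -S / (1.0 - S * S)
    else:
        raise ValueError(f"unknown model: {model}")
    # Eq. (15): q_r = P_rr and q_rs = 2 S_rs P_rs
    return p_aa, p_aa, 2.0 * S * p_ab


S = overlap_1s(1.4)  # roughly the equilibrium separation of H2
for model in ("MO", "VB", "triplet"):
    q_a, q_b, q_ab = populations(S, model)
    assert isclose(q_a + q_b + q_ab, 2.0)  # Eq. (16) with N = 2
    print(f"{model:8s} q_a = {q_a:.3f}  q_ab = {q_ab:.3f}")
```

For any overlap between 0 and 1 the MO bond population exceeds the VB value, and the triplet value is negative, while the populations always sum to N = 2 as (16) requires.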
The MO value is too large, by comparison with the value resulting from the best function used so far, while the VB value is a little too small. On the other hand, for the triplet state obtained by interchanging the signs in (1), the bond population becomes strongly negative: this result means that, with parallel coupling of the spins, electron density is 'scooped out' from the orbital overlap region and 'piled up' on the atoms - giving a very graphic interpretation of bonding and antibonding character.

3. CLASSICAL VB THEORY: PERFECT-PAIRING AND RESONANCE

The 'classical' VB theory, as developed largely by Pauling and his many collaborators during the 'thirties and 'forties, is a direct qualitative generalization of the ideas introduced by Heitler and London in their hydrogen molecule calculation and its extension to many-electron systems. This approach, which preceded Slater's introduction of determinants, rested upon group-theoretical foundations (see Appendix 3 of Ref.[7] for a brief summary of the required results). General treatments of spin coupling and of group theory may be found in the books by Pauncz [11] and Wigner [12].
3.1 Symmetry Considerations

The wavefunction (1), antisymmetric under exchange of space-spin variables of the two electrons, was conveniently factorized into a product of space and spin factors: anti-parallel coupling of the spins (antisymmetric spin factor) then implied symmetry of the spatial factor and a consequent enhancement of electron density in the bond region. Such a factorization is unique to a 2-electron system: for an N-electron system, a Pauli-compatible function must have the form

\[ \Psi(x_1, x_2, \ldots, x_N) = \sum_\kappa \Phi_\kappa(\mathbf{r}_1, \mathbf{r}_2, \ldots, \mathbf{r}_N)\,\Theta_\kappa(s_1, s_2, \ldots, s_N), \qquad (18) \]

where the functions $\Phi_\kappa$ all correspond to the same energy, while the $\Theta_\kappa$ are spin functions for the same spin eigenvalues $(S, M_S)$, each of the two sets providing a particular irreducible representation of the group of N! permutations of electron indices. This means, for example, that under the permutation operation $\hat{P}$, any $\Theta_\kappa$ is mixed with its partners according to

\[ \hat{P}\Theta_\kappa = \sum_\lambda \Theta_\lambda D(P)_{\lambda\kappa}. \qquad (19) \]

The set of spatial functions $\{\Phi_\kappa\}$, which may be either exact degenerate eigenfunctions of the Hamiltonian operator $\hat{H}$ or approximations to them, must then provide the 'associate' representation $\bar{D}$ in which† (using $\epsilon_P$ for the parity factor, $\pm 1$ for $\hat{P}$ even or odd)

\[ \hat{P}\Phi_\kappa = \sum_\lambda \Phi_\lambda \bar{D}(P)_{\lambda\kappa} = \sum_\lambda \epsilon_P\,\Phi_\lambda D(P)_{\lambda\kappa}. \qquad (20) \]

The matrices D(P) with elements $D(P)_{\lambda\kappa}$ form a matrix representation of the group of permutations; and this representation is characterized by the labels S, $M_S$ which indicate the total spin (S) and its z-component ($M_S$). From (19) it follows that, when an orthonormal set of spin eigenfunctions is available, the matrix elements $D(P)_{\lambda\kappa}$ may be expressed as

\[ D(P)_{\lambda\kappa} = \langle\Theta_\lambda|\hat{P}|\Theta_\kappa\rangle \qquad (21) \]

and this shows that the representation label $M_S$ is redundant, since 'stepping-up' or 'stepping-down' the spin z-component in (21) leaves the matrices unchanged. The representation matrices are therefore commonly indicated by $D_S(P)$. The index $\kappa$ distinguishes the linearly independent spin eigenfunctions of given S, for any chosen $M_S$, which are mixed among themselves under the permutations; and the number of such functions is the dimension of the representation.

† Strictly, the matrix whose elements appear in (20) should be $\bar{D}(P)$, which is the 'dual' (transposed inverse) of D(P). In the present context, however, the representations are real-orthogonal and the two matrices coincide; consequently $\bar{D}(P) = \epsilon_P D(P)$.
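Equations (19) and (21) can be illustrated for the smallest non-trivial case, three spins coupled to a doublet ($S = M_S = 1/2$), where the representation is two-dimensional. The sketch below is a hedged illustration rather than part of the original treatment: the two orthonormal spin functions are written out by hand (they are the usual genealogical functions for three spins, stated here as an assumption), a permutation is taken to move the spin factor in position i to position p[i], and all function names are hypothetical.

```python
from itertools import permutations
from math import isclose, sqrt

# A spin product is a string over 'a' (alpha) and 'b' (beta); a spin function
# is a dict mapping each product to its coefficient.
THETA = [
    {"aba": 1 / sqrt(2), "baa": -1 / sqrt(2)},  # spins 1 and 2 paired, spin 3 up
    {"aab": 2 / sqrt(6), "aba": -1 / sqrt(6), "baa": -1 / sqrt(6)},
]


def apply_perm(p, func):
    """Apply permutation p: the spin factor in position i moves to position p[i]."""
    out = {}
    for product, coeff in func.items():
        moved = [""] * len(product)
        for i, spin in enumerate(product):
            moved[p[i]] = spin
        key = "".join(moved)
        out[key] = out.get(key, 0.0) + coeff
    return out


def overlap(f, g):
    """Scalar product of two spin functions (the spin products are orthonormal)."""
    return sum(c * g.get(product, 0.0) for product, c in f.items())


def rep_matrix(p):
    """D(P)[l][k] = <Theta_l | P | Theta_k>, as in Eq. (21)."""
    return [[overlap(THETA[l], apply_perm(p, THETA[k])) for k in range(2)]
            for l in range(2)]


def compose(p, q):
    """Permutation equivalent to applying q first and then p."""
    return tuple(p[q[i]] for i in range(len(p)))


def matmul(A, B):
    """Product of two 2x2 matrices stored as nested lists."""
    return [[sum(A[i][m] * B[m][j] for m in range(2)) for j in range(2)]
            for i in range(2)]


for p in permutations(range(3)):
    print(p, [[round(x, 3) for x in row] for row in rep_matrix(p)])
```

Exchanging the paired spins gives the diagonal element -1 for the first function, while exchanging one paired spin with the unpaired one gives +1/2; the six matrices multiply in the same way as the permutations themselves, which is what is meant by saying that the functions 'carry' a representation.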
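In principle, it is easy to calculate the electronic energy for a wavefunction of type (18). With the assumptions that the spin functions are orthonormal and that $\hat{H}$ is spin-free, the energy expectation value is

\[ E = \frac{\sum_{\kappa,\lambda}\langle\Phi_\kappa|\hat{H}|\Phi_\lambda\rangle\langle\Theta_\kappa|\Theta_\lambda\rangle}{\sum_{\kappa,\lambda}\langle\Phi_\kappa|\Phi_\lambda\rangle\langle\Theta_\kappa|\Theta_\lambda\rangle}, \]

which at once reduces to

\[ E = \frac{\sum_\kappa\langle\Phi_\kappa|\hat{H}|\Phi_\kappa\rangle}{\sum_\kappa\langle\Phi_\kappa|\Phi_\kappa\rangle} \]

and finally to

\[ E = \frac{\langle\Phi_\kappa|\hat{H}|\Phi_\kappa\rangle}{\langle\Phi_\kappa|\Phi_\kappa\rangle} \quad (\text{any } \kappa), \qquad (22) \]

the last step following because ($\kappa$ being a degeneracy index) the same term is simply repeated in both numerator and denominator, for each value of $\kappa$.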
Figure 2. The branching diagram (horizontal axis: N, number of electrons).

In order to make use of (22) it is necessary to decide upon a particular $\Theta_\kappa$, by choosing an appropriate 'spin-coupling scheme'; and then to construct its 'partner' $\Phi_\kappa$. The standard irreducible representations $D_S(P)$ are carried, as in (19), by the 'branching diagram' (BD) functions indicated in Fig.2. But with one exception these provide no simple association with chemical bonding: the exception is the function set up by following the bottom path, in which the spins are coupled successively in pairs, possibly followed by a parallel-coupled sequence to give any desired total spin. Although Weyl and Rumer [13] had already explored the mathematical properties of such functions, it was Pauling [14] who first fully recognized their immediate applicability to problems of chemical bonding and gave simple graphical rules for making energy calculations. To obtain such rules it is necessary to set up orbital approximations to the formally exact eigenfunctions ($\Phi_\kappa$) of the Hamiltonian.
The natural starting point is an orbital product

\[ \Omega_{k_1,k_2,\ldots,k_N} = \phi_{k_1}(\mathbf{r}_1)\phi_{k_2}(\mathbf{r}_2)\cdots\phi_{k_N}(\mathbf{r}_N), \qquad (23) \]

in which $\phi_{k_n}$ is the orbital occupied by electron n and no orbital appears more than twice, in order that the Pauli principle may be respected. It is convenient to put all doubly occupied orbitals first and (as their corresponding electron spins must always be coupled up-down-up-down ... as in the bottom path in Fig.2) to make no further explicit reference to them. For purposes of defining and counting the independent spin eigenfunctions, N will thus correspond to the number of singly-occupied orbitals, whose electron spins are to be coupled to total spin S. The bottom-path spin function will be denoted by $\Theta_1$ ($\kappa = 1$) and it remains only to generate from (23) a combination of permuted products, namely $\Phi_1$, which will serve as a partner to $\Theta_1$ in (20). The bottom-path function is

\[ \Theta_1(s_1, s_2, \ldots, s_N) = \theta(s_1, s_2)\,\theta(s_3, s_4)\cdots\theta(s_{2g-1}, s_{2g})\,\alpha(s_{2g+1})\cdots\alpha(s_N), \qquad (24) \]

in which there are g paired-spin factors, with $\theta(s_i, s_j) = [\alpha(s_i)\beta(s_j) - \beta(s_i)\alpha(s_j)]/\sqrt{2}$, the remaining spins being parallel-coupled: it is a spin function with $S = M_S = \frac{1}{2}(N - 2g)$.

A standard method of generating a function of given 'symmetry species', behaving like $\Phi_\kappa$, is to sum all the permuted products $\hat{P}\Omega$, formed from (23) by making the N! index permutations, with suitable coefficients. This recipe is well known in group theory [11] and embodied in the 'Wigner operator' (for representation $D_S$ in (20))

\[ \rho^S_{\kappa\lambda} = \frac{f_S^N}{N!}\sum_P \epsilon_P D_S(P)_{\kappa\lambda}\hat{P}, \qquad (25) \]

which may be applied to an arbitrary function $\Omega(\mathbf{r}_1, \mathbf{r}_2, \ldots, \mathbf{r}_N)$ to 'project out' a component which will behave under permutations exactly like the function $\Phi_\kappa$ in (20). Thus (dropping the superscript S, which will be fixed)

\[ \rho_{\kappa\lambda}\Omega(\mathbf{r}_1, \mathbf{r}_2, \ldots, \mathbf{r}_N) = \frac{f_S^N}{N!}\sum_P \epsilon_P D_S(P)_{\kappa\lambda}\hat{P}\,\Omega(\mathbf{r}_1, \mathbf{r}_2, \ldots, \mathbf{r}_N) \sim \Phi_\kappa, \qquad (26) \]

where '$\sim$' means "transforms like". In these equations, the normalizing factor contains 'Wigner's number'

\[ f_S^N = \frac{(2S + 1)\,N!}{(\frac{1}{2}N + S + 1)!\,(\frac{1}{2}N - S)!}, \qquad (27) \]

which is the number of linearly independent spin functions of given N, S and coincides with the number of distinct BD paths leading from the origin to the desired end point with spin S. Evidently the simplest way of obtaining $\Phi_1$ is to put $\lambda = \kappa = 1$ in (25) and to use (dropping the unessential normalizing factor)

\[ \Phi(\mathbf{r}_1, \mathbf{r}_2, \ldots, \mathbf{r}_N) = \sum_P \epsilon_P D_S(P)_{11}\hat{P}\,\Omega(\mathbf{r}_1, \mathbf{r}_2, \ldots, \mathbf{r}_N). \qquad (28) \]
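The statement that Wigner's number (27) counts the branching-diagram paths is easily checked by direct enumeration. The sketch below is illustrative only: it works with the integer 2S to avoid half-integers, counts paths recursively (each added electron raises or lowers S by 1/2, and S may never become negative), and uses function names chosen for this example.

```python
from functools import lru_cache
from math import factorial


def wigner_number(n, two_s):
    """Eq. (27) with S = two_s / 2; n is the number of spins to be coupled."""
    if two_s < 0 or two_s > n or (n - two_s) % 2:
        raise ValueError("no spin state with this N and S")
    upper = (n + two_s) // 2 + 1  # N/2 + S + 1
    lower = (n - two_s) // 2      # N/2 - S
    return (two_s + 1) * factorial(n) // (factorial(upper) * factorial(lower))


@lru_cache(maxsize=None)
def branching_paths(n, two_s):
    """Number of branching-diagram paths from the origin to the point (N, S)."""
    if two_s < 0 or two_s > n:
        return 0
    if n == 0:
        return 1
    # The last step either raised S by 1/2 or lowered it by 1/2.
    return branching_paths(n - 1, two_s - 1) + branching_paths(n - 1, two_s + 1)


for n in range(1, 9):
    row = [(two_s / 2, wigner_number(n, two_s)) for two_s in range(n % 2, n + 1, 2)]
    print(n, row)
```

For six singly-occupied orbitals coupled to a singlet the count is five, so five linearly independent spin-coupled functions are available in that case.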
For future reference we note that the operators defined in (25) possess the properties

\[ \rho_{\kappa\lambda}\rho_{\mu\nu} = \delta_{\lambda\mu}\rho_{\kappa\nu}, \qquad (29) \]

and that consequently $\rho_{11}\rho_{11} = \rho_{11}$, i.e. $\rho_{11}$ is idempotent. The function (28) is a direct generalization of the spatial factor in the Heitler-London function (1): it derives from g orbital pairs $(\phi_{k_1}\phi_{k_2}), (\phi_{k_3}\phi_{k_4}), \ldots$ associated with anti-parallel coupled spins; and each term in the sum will also contain a factor that refers to the un-paired orbitals (if any) whose spins will be parallel-coupled to give a state of total spin S. According to the Heitler-London-Pauling interpretation, it represents a perfect-pairing approximation to the wavefunction of a molecule in which there are covalent bonds indicated by $k_1 - k_2$, $k_3 - k_4$, etc. and, for a non-singlet state, one or more 'unpaired electrons' which may endow the system with free-radical character.

Clearly, different functions of type (28) will result from different choices of the paired orbitals $(k_1 - k_2), (k_3 - k_4), \ldots$ etc. and hence, notionally, with different ways of allocating chemical bonds. All such functions possess the same Pauli-compatible symmetry ($\kappa = 1$) and may thus be mixed to give, according to the variation principle, an improved approximation to an exact eigenfunction of the spin-free Hamiltonian. In some situations, a single 'structure' may be a good first approximation, giving a good estimate of the energy of a state of given S: this perfect-pairing scheme is then likely, as Pauling suggested, to give a good account of the actual chemistry of the molecule. And indeed most saturated molecules fall into this category, the bonding being well described in terms of overlapping pairs of singly occupied valence orbitals on the constituent atoms.

In other cases, however, a single structure may be totally inadequate because there are several acceptable bonding schemes. The classic example is, of course, the benzene molecule. Here, disregarding all except the most loosely bound electrons (i.e. the $\pi$ electrons, which are empirically associated with the special properties of aromatic systems), the problem is one of associating three ($\pi$) bonds with six carbon atoms. If the $\pi$-type AOs are denoted by $\phi_1, \phi_2, \ldots, \phi_6$, two obviously acceptable bonding schemes (cf. (24)) arise from $\Omega_I = \phi_1\phi_2\phi_3\phi_4\phi_5\phi_6$ and $\Omega_{II} = \phi_2\phi_3\phi_4\phi_5\phi_6\phi_1$; these are represented by the Kekulé structures in Fig.3, where three other schemes (the Dewar structures) are also indicated.

Figure 3. Kekulé and Dewar structures for benzene.

In such cases, in the terminology of the 'thirties, the state of the molecule is best represented as a 'resonance mixture of alternative VB structures'; and the (spatial) wavefunction of correct symmetry is written as

\[ \Phi = c_I\Phi_I + c_{II}\Phi_{II} + c_{III}\Phi_{III} + \ldots + c_V\Phi_V, \qquad (30) \]

where the $\Phi$'s are projected, according to (26), from the corresponding orbital products. It must be stressed that there is a unique association between the order in which the orbitals appear in a given $\Omega$ and the pairing scheme which specifies a Weyl-Rumer (WR) function: whichever orbitals appear in the 1st and 2nd places have their spins paired, similarly for those in the 3rd and 4th places, and so on. Unless otherwise stated, the indices $\kappa, \lambda, \ldots$ will from now on refer exclusively to WR functions and thus distinguish the different orbital products from which they are generated.

It is now time to indicate how the qualitative ideas of VB theory can be put on a quantitative footing. The fundamental premises are simply (i) that valency is associated with singly-occupied orbitals on the constituent atoms of a molecule, more tightly bound electrons being assigned to doubly-occupied orbitals of a 'core'; and (ii) that the valence electrons may be well described using resonance mixtures of alternative spin-paired structures. For the moment, following Pauling and his contemporaries, we explore the implications of (ii) - noting only that (i) represents a remarkable result of empiricism, intuition and faith, to be fully justified only decades later.

3.2 Calculation of the energy

Let us consider first the single structure

\[ \Phi(1\text{-}2, 3\text{-}4, \ldots) = \rho_{11}\Omega(\mathbf{r}_1, \mathbf{r}_2, \ldots, \mathbf{r}_N), \qquad (31) \]

in which it is assumed that the spins of electrons in $\phi_1, \phi_2$ are paired, also those in $\phi_3, \phi_4$ and so on for g such pairs, and that the remaining N - 2g spins are parallel-coupled to a resultant spin $S = \frac{1}{2}(N - 2g)$. The variational energy expression (22) (always for functions of type (26) with $\lambda = \kappa = 1$) readily reduces on using (29) et seq. Thus, noting that $\rho_{11}$ may be 'passed through' the operator $\hat{H}$ (which is invariant under electron permutations) to give $\rho_{11}^2$ ($= \rho_{11}$) working on the right-hand function, we find

\[ E = M\sum_P \epsilon_P D_S(P)_{11}\langle\Omega|\hat{H}|\hat{P}\Omega\rangle, \qquad (32) \]
where M is simply a normalizing factor and the matrix element involves only two orbital products. In the early days of VB theory, no progress could be made without drastic approximations - for this expression contains N! terms, each involving a large number of 1- and 2-electron integrals. The approximation made by Pauling and others was that the orbitals could be considered roughly orthogonal, i.e. that overlap integrals $\langle\phi_r|\phi_s\rangle$ could be neglected for $s \neq r$: this results in a dramatic simplification, only two types of contribution remaining. These are the so-called 'Coulomb integral'

\[ Q = \langle\Omega|\hat{H}|\Omega\rangle = \int \Omega^*\hat{H}\,\Omega\; d\mathbf{r}_1 d\mathbf{r}_2 \ldots d\mathbf{r}_N, \qquad (33) \]

which arises from the identity permutation, and

\[ K_{rs} = \langle\Omega|\hat{H}|\hat{P}_{rs}\Omega\rangle = \langle\cdots\phi_r\cdots\phi_s\cdots|\hat{H}|\cdots\phi_s\cdots\phi_r\cdots\rangle, \qquad (34) \]

which arises from the single interchange $\hat{P}_{rs}$. The evaluation of (32) will then proceed nicely, provided a formula can be found for the numerical coefficient $D_S(P_{rs})_{11}$ which has not so far been considered. The evaluation of the required coefficient proceeds directly from (21): for, in general,

\[ D_S(P)_{11} = \langle\Theta_1|\hat{P}|\Theta_1\rangle = \langle\Theta_1|\Theta_1'\rangle, \qquad (35) \]

where $\Theta_1'$ is a spin function obtained by permuting the spin variables according to the index permutation $\hat{P}$. As Rumer first pointed out [13], the overlap integral on the right (in which the function $\Theta_1'$ is not in general a branching diagram function) is easy to evaluate by expanding (24), applying the permutation, and counting the number of terms that exactly match those in the un-permuted function. Rumer's graphical construction involves representing the spin indices by points on a circle (Fig.4) and linking those that refer to the paired spins, any remaining points (shown as bold dots) referring to the parallel-coupled spins:
thus Fig.4a shows a function with two spin pairs, 1-2 and 3-4, and spins 5, 6 parallel-coupled; Fig.4b indicates the effect of the spin exchange $\hat{P}_{23}$; and Fig.4c shows the result of superimposing the two diagrams.

Figure 4. Rumer diagrams and a superposition pattern.

The value of the matrix element for any two WR functions is obtained [14] by inspection of their 'superposition pattern', as depicted in Fig.4c: in the most general case [15], the pattern may contain any number of 'islands' and 'open chains' (E-chains and O-chains connecting an Even or Odd number of points) and the required formula is

\[ \langle\Theta_\kappa|\Theta_\lambda\rangle = \delta^E_{\kappa\lambda}\,(-1)^{\nu_{\kappa\lambda}}\,2^{(n_{\kappa\lambda} - g)}, \qquad (36) \]

where

$\nu_{\kappa\lambda}$ = number of arrow reversals to achieve head-head, tail-tail matching
$n_{\kappa\lambda}$ = number of islands in pattern
$g$ = number of paired spins in each function
$\delta^E_{\kappa\lambda}$ = 1 (no E-chains), or 0 (otherwise)

It should be noted that the use of arrows ($\alpha \rightarrow \beta$), rather than simple links, is in general necessary to ensure the correct phase ($\pm$) of the result. Two superimposed arrows in a superposition pattern form an island, while an unlinked point counts as an O-chain. It is now a simple matter to derive from (32), using (35) and (36), an expression for the energy associated with a single structure: the final result (first derived by Dirac [16] in 1929) is
$$E = Q + \sum_{(rs)\ \mathrm{paired}} K_{rs} - \tfrac{1}{2}\sum_{(rs)\ \mathrm{uncoupled}} K_{rs} - \sum_{(rs)\ \mathrm{parallel}} K_{rs}, \tag{37}$$
where (rs) refers to any pair of orbitals (counted once only) and "uncoupled" means that r, s are neither paired with each other nor parallel-coupled. The energy formula (37) formed the basis for a vast amount of work, at a semiquantitative level, in which the Coulomb and 'resonance' integrals ($Q$, $K_{rs}$) were regarded as disposable parameters and assigned numerical values, characteristic of particular molecules and types of bond, in order to rationalize observed properties such as heats of formation and molecular geometries. Before considering the theoretical validity of this approach, however, we generalize to the case in which the perfect pairing approximation is inadequate and it is necessary to admit resonance among a number of VB structures associated with alternative plausible assignments of the chemical bonds. In the general case, the wavefunction will be expanded in the form (cf. (30) et seq.)

$$\Psi = \sum_{\kappa} c_\kappa \Phi_\kappa, \tag{38}$$
where the 'best' expansion coefficients $c_\kappa$ will follow (according to standard variation theory) on solving a set of 'secular equations' which may be expressed in matrix form as
Hc = EMc.
(39)
The coefficients are collected in the column c, while the square matrices have elements

$$H_{\kappa\lambda} = \langle\Phi_\kappa|\hat{H}|\Phi_\lambda\rangle, \qquad M_{\kappa\lambda} = \langle\Phi_\kappa|\Phi_\lambda\rangle.$$

Here it should be noted that M contains only the overlap integrals between the structures and describes the 'metric' of the 'vector space' which they span. With the same assumptions as in the derivation of (37) from (32), the expression for the matrix element $H_{\kappa\lambda}$ reduces first to

$$H_{\kappa\lambda} = Q\,D_S(\hat{I})_{\kappa\lambda} - \sum_{(rs)} K_{rs}\,D_S(\hat{P}_{rs})_{\kappa\lambda},$$
where $\hat{I}$ denotes the identity permutation and, as before, the coefficients of $Q$ and $K_{rs}$ are expressible as matrix elements according to (21) in which $\kappa$, $\lambda$ now label WR functions. The effect of $\hat{P}$, working on the right-hand function, is to modify the superposition pattern for $\Phi_\kappa$ and $\Phi_\lambda$, and by applying (35) (with 11 replaced by $\kappa\lambda$) it is possible to evaluate the required matrix element. The result takes its simplest form for singlet structures ($S = 0$) and then reduces to

$$\begin{aligned}
H_{\kappa\lambda} = M_{\kappa\lambda}\Bigl[\,Q &+ \sum K_{rs}\quad (r, s\ \text{same island, odd number of links apart})\\
&- \tfrac{1}{2}\sum K_{rs}\quad (r, s\ \text{different islands})\\
&- 2\sum K_{rs}\quad (r, s\ \text{same island, even number of links apart})\Bigr],
\end{aligned} \tag{40}$$

where

$$M_{\kappa\lambda} = (-1)^{\nu_{\kappa\lambda}}\,2^{(n_{\kappa\lambda}-g)}. \tag{41}$$
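As a hedged illustration of how the superposition rules (40) and (41) feed the secular equations (39), the sketch below treats the two Kekulé structures of a six-membered ring with a single nearest-neighbour exchange integral. The function names (`superpose`, `pauling_element`, `solve_secular`), the 0-based site labels, the arrow lists and the values Q = 0, K = -1 are assumptions made for this example, not notation from the text; only singlet structures in which every spin is paired are handled.

```python
import numpy as np

def superpose(arrows_a, arrows_b, n):
    """Superposition pattern of two singlet structures on n points.

    Each structure is a list of arrows (tail, head). Returns the island
    label and alternating colour of every point, plus the overlap (41).
    """
    partner_a, partner_b = {}, {}
    for tail, head in arrows_a:
        partner_a[tail], partner_a[head] = head, tail
    for tail, head in arrows_b:
        partner_b[tail], partner_b[head] = head, tail
    island, is_tail = {}, {}
    n_islands = 0
    for start in range(n):
        if start in island:
            continue
        point, colour = start, True
        while point not in island:          # walk once round one island
            island[point], is_tail[point] = n_islands, colour
            point = partner_a[point] if colour else partner_b[point]
            colour = not colour
        n_islands += 1
    # arrows whose tail sits on a 'head' point would have to be reversed
    reversals = sum(not is_tail[t] for t, _ in list(arrows_a) + list(arrows_b))
    overlap = (-1.0) ** reversals * 2.0 ** (n_islands - n // 2)
    return island, is_tail, overlap

def pauling_element(arrows_a, arrows_b, n, q, k):
    """H and M elements from rules (40) and (41); k maps (r, s) -> K_rs."""
    island, is_tail, overlap = superpose(arrows_a, arrows_b, n)
    bracket = q
    for (r, s), k_rs in k.items():
        if island[r] != island[s]:
            bracket -= 0.5 * k_rs           # different islands
        elif is_tail[r] != is_tail[s]:
            bracket += k_rs                 # same island, odd separation
        else:
            bracket -= 2.0 * k_rs           # same island, even separation
    return overlap * bracket, overlap

def solve_secular(h, m):
    """Solve Hc = EMc by symmetric orthogonalization of the metric M."""
    w, v = np.linalg.eigh(m)
    x = v @ np.diag(w ** -0.5) @ v.T
    return np.linalg.eigvalsh(x @ h @ x)

n = 6
kekule = [[(0, 1), (2, 3), (4, 5)], [(1, 2), (3, 4), (5, 0)]]
k_ring = {(i, (i + 1) % n): -1.0 for i in range(n)}   # assumed K = -1, with Q = 0
h = np.zeros((2, 2))
m = np.zeros((2, 2))
for a in range(2):
    for b in range(2):
        h[a, b], m[a, b] = pauling_element(kekule[a], kekule[b], n, 0.0, k_ring)
energies = solve_secular(h, m)
print(energies)   # the lower root corresponds to E = Q + 2.4 K
```

With these inputs the diagonal elements are Q + 1.5K and the lower root is Q + 2.4K, the familiar two-structure estimate; the sign of the off-diagonal overlap depends on the arrow directions chosen, but the roots do not.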
The matrix element rules (40) and (41), which apply to the superposition pattern for any two structures $\Phi_\kappa$, $\Phi_\lambda$ for a singlet state, were first given by Pauling [14] in 1933. They were to form the basis of nearly all semi-empirical applications of VB theory to polyatomic molecules during the next few decades. Before considering the defects of the theory in the classical form just presented, it is worth noting that all the previous results may be embodied in an elegant "vector model", introduced by Dirac [16] and van Vleck [17]. This model refers formally to a spin-only system, whose Hamiltonian contains only spin operators and numerical parameters: the matrix elements of this model Hamiltonian, taken between pure spin functions, coincide with those obtained using the rules given above and thus lead to exactly the same secular equations and approximate energy eigenstates. The parameters in the model Hamiltonian are the coulomb and resonance integrals already defined and its form is simply

$$\hat{H}_S = Q' - 2\sum_{(ij)} K_{ij}\,\mathbf{S}(i)\cdot\mathbf{S}(j), \qquad \Bigl(Q' = Q - \tfrac{1}{2}\sum_{(ij)} K_{ij}\Bigr). \tag{42}$$
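The following sketch, offered as an illustration rather than as part of the original treatment, builds the spin-only Hamiltonian (42) as an explicit matrix for a ring of six spin-1/2 sites (a model of the benzene π system) and diagonalizes it. It also checks numerically that the two-spin exchange operator equals (1/2)[1 + 4 S(i).S(j)]. The nearest-neighbour-only choice of K_ij, the values Q = 0 and K = -1, and all function names are assumptions of the example.

```python
import numpy as np

# One-spin operators in the (alpha, beta) basis
SX = np.array([[0.0, 0.5], [0.5, 0.0]])
SY = np.array([[0.0, -0.5j], [0.5j, 0.0]])
SZ = np.array([[0.5, 0.0], [0.0, -0.5]])

def site_operator(op, site, n):
    """Embed a one-spin operator at position `site` of an n-spin product space."""
    out = np.array([[1.0]])
    for k in range(n):
        out = np.kron(out, op if k == site else np.eye(2))
    return out

def spin_dot(i, j, n):
    """Matrix of the scalar product S(i).S(j)."""
    return sum(site_operator(s, i, n) @ site_operator(s, j, n) for s in (SX, SY, SZ))

def model_hamiltonian(n, q, k):
    """H_S = Q' - 2 sum K_ij S(i).S(j), with Q' = Q - (1/2) sum K_ij."""
    q_prime = q - 0.5 * sum(k.values())
    h = q_prime * np.eye(2 ** n, dtype=complex)
    for (i, j), k_ij in k.items():
        h -= 2.0 * k_ij * spin_dot(i, j, n)
    return h

# Two-spin check: the exchange operator interchanges alpha-beta and beta-alpha
swap = np.array([[1, 0, 0, 0], [0, 0, 1, 0], [0, 1, 0, 0], [0, 0, 0, 1]], dtype=complex)
identity_ok = np.allclose(0.5 * (np.eye(4) + 4.0 * spin_dot(0, 1, 2)), swap)

n = 6
k_ring = {(i, (i + 1) % n): -1.0 for i in range(n)}   # assumed K = -1, with Q = 0
levels = np.linalg.eigvalsh(model_hamiltonian(n, 0.0, k_ring))
print(identity_ok, levels[0])
```

Because the full 64-dimensional spin space is used, the lowest level corresponds to allowing all five covalent structures to mix, so it lies below the two-structure value Q + 2.4K.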
The validity of (42) rests upon the well known Dirac spin-exchange identity, $\hat{P}_{ij} = \tfrac{1}{2}[1 + 4\,\mathbf{S}(i)\cdot\mathbf{S}(j)]$, which is easily confirmed by testing the effect of each side on the four spin products $\alpha(s_i)\alpha(s_j)$, $\alpha(s_i)\beta(s_j)$, etc.; and it is clear that (42) may be used with any choice whatever of spin functions, not only those of Weyl-Rumer form.

4. THE RISE AND FALL OF CLASSICAL VB THEORY

The simple mathematical machinery developed in the last Section was used to great effect, throughout the 'thirties and 'forties, in applications ranging from the prediction of molecular geometries and the estimation of heats of formation to the detailed discussion of bond properties and even to the theoretical study of chemical reactions. In all these areas Pauling was a leading pioneer. To illustrate such applications and to introduce other important developments during this period, one example must suffice, namely that provided by the water molecule.
Example: The electronic structure of H2O

The water molecule is visualized as arising from the reaction of an oxygen atom in its $^3P$ ground state with two $^2S$ hydrogens. If the electron configuration of the oxygen is indicated as O[$1s^2\,2s^2\,2p_z^2\,2p_x\,2p_y$], then the $1s^2$ represents an 'inner shell', the $2s^2$ and $2p_z^2$ represent two 'lone pairs', and the remaining 'valence electrons' will have their spins parallel-coupled (Hund's rule) so that the corresponding (positive) $K_{rs}$ in (42) will enter with a negative sign. If the two hydrogens approach as indicated in Fig.5 there will be two possible coupling schemes for the spins of the 4 valence electrons, namely (I) with $p_1$, $H_1$ and $p_2$, $H_2$ paired; and (II) with $p_1$, $p_2$ and $H_1$, $H_2$ paired, both structures contributing to the observed singlet ground state of the molecule.
Figure 5. Valence orbitals in the water molecule.

The assumptions suggested by the Heitler-London calculation are that for any two orbitals involved in chemical bonding the corresponding exchange integral has a negative value, as a result of their non-zero overlap, and that this will become more negative as overlap increases. Structure I alone will give a good perfect-pairing approximation when the magnitude of the oxygen-hydrogen exchange integrals greatly exceeds that of the 'internal' oxygen-oxygen term, which is positive. This is indeed the situation at the observed geometry, which is 'V-type' as suggested by maximum overlap considerations; but if the bonds are stretched Structure II becomes more and
more important, its presence ensuring that the molecule will "dissociate correctly", leaving the oxygen in its triplet ground state.

Fifty years ago the description of many aspects of chemical behaviour appeared much simpler than today. As the above example indicates, reactions apparently involved merely the uncoupling and recoupling of spins. And the use of a model Hamiltonian such as (42), containing only a few empirical parameters, made it easy to rationalize a large body of experimental data, even in a semi-quantitative way. Unfortunately, with the development of powerful computers, starting during the 'fifties and continuing to the present day, the whole simple and elegant framework of classical VB theory was swept away and largely forgotten. The reasons, apart from the natural desire to exploit computing power to make reliable and accurate predictions, were inherent in the theory as developed so far, which contains defects and inconsistencies to which we must now turn. Let us start by listing the guiding concepts underlying the application of VB theory in its classical form:

• Only the most loosely bound electrons in an atom or molecule need be considered explicitly, the 'valence electrons' being separated from a 'core', whose presence is simulated by using a suitable 'effective' (valence-only) Hamiltonian or, in practice, by assigning empirical values to the Coulomb and exchange integrals involving valence orbitals.

• Overlap integrals are neglected, in all matrix element expressions, while exchange integrals are neglected except for pairs of orbitals assumed to be engaged in bonding; the latter have negative values, becoming more negative for orbitals that approach as closely as possible. This idea is formalized in a 'criterion of maximum overlap': the orbitals used in describing chemical bonds will be those that overlap most strongly, as in the example referred to above (Fig.5).
• When a particular allocation of bonds appears to be overwhelmingly favoured on 'chemical grounds', the corresponding VB structure is accepted as an appropriate wavefunction and is referred to as a 'perfect-pairing approximation': when several possible allocations appear to be reasonable, the wavefunction is represented as a linear combination of corresponding structures (mixing coefficients being calculated from a set of secular equations) and, in Pauling's own words, the molecule is said to 'resonate among several bonded structures'. The energy of the resonance mixture is invariably lower than that of any of the single structures and the lowering of the energy is referred to as resonance energy, for a long time a key concept in organic chemistry (see, for example, Ref.[18]).

• The validity of a perfect-pairing approximation depends on the particular choice of paired orbitals and may be improved by introducing new orbitals, as linear combinations of the original set, so as to lower the energy of the structure. Mixing of the orbitals in this way, in the context of VB theory, is usually referred to as hybridization. Hückel [19] introduced this idea, for s- and p-type orbitals, showing that the s-p mixtures had strong 'directional' properties which could enhance their overlap with neighbouring orbitals and thus improve the performance of a single-structure approximation. Pauling [20] independently put forward the hybridization concept, including also the d orbitals. He showed that a wide range of strongly directed hybrids could be formed and that they could be grouped into sets of high symmetry, well-adapted to the description of multiple bonds between, for example, a metal atom and a number of surrounding ligands: in this way he opened up the possibility of applying VB methods to molecules containing heavy atoms and, in particular, to the whole of transition metal chemistry.

The inherent defects of VB theory in its classical semi-empirical form are closely related to the premises listed above. They are:

• The complete disregard of atomic inner shells, except in so far as their effect may be 'simulated' by adjustment of numerical parameters, clearly provides no basis for more rigorous non-empirical developments. Even the application of variation methods to the valence electron system is fraught with dangers, the wavefunction being liable to 'collapse' into the region of the inner-shell 'core'. Early workers were well aware of such dangers, typically referred to [3] as the "nightmare of the inner shells".

• The neglect of overlap integrals is indefensible, both on numerical grounds (they frequently have values of 0.7 or more for paired valence orbitals) and from the point of view of internal consistency. Large numerical values, required for efficient bonding, mean that the multiple exchange integrals, which result from permutations involving many electrons at a time (of which there are N!), can certainly not be discarded. And the assignment of negative values to the single-interchange $K_{rs}$ cannot be justified unless overlap is included.
Thus, in the Heitler-London calculation (Sect.2), reduction of the exchange integral leads to

$$K = \langle\chi_A\chi_B|\hat{H}|\chi_B\chi_A\rangle = 2S\langle\chi_A|\hat{h}|\chi_B\rangle + \langle\chi_A\chi_B|g|\chi_B\chi_A\rangle, \tag{43}$$

of which the first term has nothing to do with 'exchange', being a one-electron integral which contains kinetic energy and potential energy associated with the overlap distribution $\chi_A\chi_B$ for an electron in the field of the nuclei: it is this term that is associated with chemical bonding, being large and negative, while the 'true' exchange term $\langle\chi_A\chi_B|g|\chi_B\chi_A\rangle$ is essentially positive. To put $S \approx 0$ is to throw away the chemical bond!

• The number of possible VB structures, particularly when those of 'long-bonded' and 'ionic' types are admitted, rapidly becomes astronomical for all but the smallest molecules: and, in situations where (even with carefully chosen hybrids) the validity of perfect pairing cannot be preserved, the technical difficulties of calculation appeared to be insuperable.
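To give a feeling for how quickly the number of structures grows, the sketch below evaluates two standard counting formulas: the number of linearly independent covalent spin couplings for N singly occupied orbitals, (2S+1) N! / [(N/2+S+1)! (N/2-S)!], and Weyl's dimension formula for the total number of spin-adapted structures (covalent plus ionic) for N electrons in m orbitals. Neither formula is written out in this section, so both, together with the function names, should be read as assumptions brought in for illustration.

```python
from math import comb, factorial

def covalent_structures(n_elec, two_s=0):
    """Independent spin couplings of n_elec singly occupied orbitals (S = two_s / 2)."""
    half_plus = (n_elec + two_s) // 2      # N/2 + S
    half_minus = (n_elec - two_s) // 2     # N/2 - S
    return (two_s + 1) * factorial(n_elec) // (
        factorial(half_plus + 1) * factorial(half_minus))

def all_structures(n_orb, n_elec, two_s=0):
    """Weyl's formula: spin-adapted functions for n_elec electrons in n_orb orbitals."""
    half_plus = (n_elec + two_s) // 2
    half_minus = (n_elec - two_s) // 2
    return (two_s + 1) * comb(n_orb + 1, half_minus) * comb(
        n_orb + 1, half_plus + 1) // (n_orb + 1)

# Singlet counts for N electrons in N orbitals
for n in (2, 4, 6, 10, 14):
    print(n, covalent_structures(n), all_structures(n, n))
```

For six electrons in six orbitals this gives 5 covalent structures out of 175 in all; the covalent count follows the Catalan numbers, while the total grows far faster.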
In summary, the major dilemma of VB theory in its classical form was that its beautifully simple concepts, which served many generations of chemists and were embodied in countless elementary textbooks on valency, could not be reconciled with the exigencies of actual calculation, at least with the means then available. Other methods, more easily adapted to ab initio development, began to overshadow, and eventually to totally eclipse, the area of quantum chemistry in which Pauling played such a major role.

5. MODERN VB THEORY

During the last twenty years, there has been a resurgence of interest in VB theory and in the possibility of re-casting the basic approach in a form more amenable to ab initio calculation. Several distinct approaches have emerged, all of them (with present-day computational facilities) capable of competing with the more conventional methods, which normally have their roots in the independent-particle model (IPM) and are usually based on the use of molecular orbitals (MOs) and configuration interaction (CI). The new methods that result will be referred to collectively as forms of "modern VB theory", their common features being that (i) they start from the usual spin-free Born-Oppenheimer Hamiltonian (Sect.1); (ii) they use localized orbitals that are recognizably 'atomic' in character and usually overlap very substantially; (iii) where possible, they use spin eigenfunctions related to those of classical VB theory (those of WR type); and (iv) they proceed in a completely non-empirical way, using the techniques of variation theory, evaluating all one- and two-electron integrals and throwing nothing away.
Some of these methods build the wavefunction from spin-orbitals and employ Slater determinants, so as to exploit the familiar mathematical machinery developed during the last sixty years: but here we prefer to emphasize the classical VB approach and to show that the matrix element rules used by Pauling and his contemporaries are quite sufficient for making ab initio calculations, provided the multiple exchange integrals are all included. Before doing so, however, it is worth asking why overlap could not be eliminated from the start by employing, instead of the usual AOs, orthogonal linear combinations of them, for which the matrix element rules in their simple form would be rigorously valid.

5.1 VB theory with orthogonal orbitals

Soon after Löwdin [21] introduced his well known '$S^{-1/2}$ prescription' for constructing an orthonormal set from an arbitrary set of AOs, modifying the original set as little as possible in a 'least-squares' sense, Slater [22] suggested that VB theory might be validated by introducing such orbitals and thus eliminating all the problems connected with neglect of overlap. At the same time, however, he repeated the Heitler-London calculation on H2 and reached a completely negative conclusion: with orthogonalized AOs, the single 'covalent' structure predicted no bonding at all, the energy curve indicating strong repulsion at all distances!
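The symmetric orthogonalization just mentioned can be sketched in a few lines. This is a minimal illustration, assuming a real symmetric, positive-definite overlap matrix; the function name and the numerical overlap 0.753 are choices made for the example only.

```python
import numpy as np

def lowdin_orthogonalize(overlap):
    """Return X = S^(-1/2); the columns of X expand the orthonormal orbitals in the AOs."""
    eigvals, eigvecs = np.linalg.eigh(overlap)
    return eigvecs @ np.diag(eigvals ** -0.5) @ eigvecs.T

s = 0.753                                   # assumed overlap of two 1s AOs
overlap = np.array([[1.0, s], [s, 1.0]])
x = lowdin_orthogonalize(overlap)
print(x)                                    # symmetric: each new orbital is mostly its parent AO
print(x.T @ overlap @ x)                    # metric in the new basis
```

The transformation matrix is symmetric, so the two AOs are treated on an equal footing, and the metric in the new basis comes out as the unit matrix.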
The reason for Slater's disappointing result becomes immediately clear on making a population analysis of the wavefunction [23]; when orthogonalized AOs are adopted, the bond population is found to be strongly negative at all distances. With hydrogen 1s AOs, the population values at the equilibrium distance are $q_a = q_b = 0.638$, $q_{ab} = 0.724$, indicating a significant enhancement of electron density in the bond region; but on orthogonalizing the AOs the corresponding electron-pair function gives $q_a = q_b = 2.309$, $q_{ab} = -2.618$, indicating that electron density is 'scooped out' of the bond region and 'piled up' close to the nuclei. In this way the 'formal' result, that the exchange integral (43) is positive and therefore gives no chemical bonding, receives a 'physical' interpretation through the charge density and the Hellmann-Feynman theorem.
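The population values quoted above can be reproduced from the overlap integral alone. In the sketch below the density of the two-electron covalent function is written in terms of the original AOs: for the Heitler-London function the populations are 1/(1+S²) and 2S²/(1+S²), while for the function built from symmetrically orthogonalized AOs they become 1/(1-S²) and -2S²/(1-S²). The orbital exponent of 1 and the separation of 1.4 bohr are assumptions, chosen because they are the usual values for such an illustration; the function names are likewise hypothetical.

```python
from math import exp

def overlap_1s(r, zeta=1.0):
    """Overlap of two 1s Slater orbitals (exponent zeta) a distance r bohr apart."""
    x = zeta * r
    return exp(-x) * (1.0 + x + x * x / 3.0)

def covalent_populations(s, orthogonalized):
    """Atom and bond populations (q_a = q_b, q_ab) of the covalent electron-pair function."""
    if orthogonalized:
        # density of the pair function of orthogonalized AOs, re-expressed in the original AOs
        return 1.0 / (1.0 - s * s), -2.0 * s * s / (1.0 - s * s)
    # Heitler-London function built directly from the overlapping AOs
    return 1.0 / (1.0 + s * s), 2.0 * s * s / (1.0 + s * s)

s = overlap_1s(1.4)
hl = covalent_populations(s, orthogonalized=False)
orth = covalent_populations(s, orthogonalized=True)
print(round(s, 3), [round(q, 3) for q in hl], [round(q, 3) for q in orth])
```

In both cases the populations sum to two electrons (2q_a + q_ab = 2), so the negative bond population of the orthogonalized function is exactly compensated by the excess density on the atoms.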
Of course, on admitting CI by adding ionic structures it should be possible to reproduce the best energy curve shown in Fig.1; but the interpretation of the wavefunction will then be unconventional. In fact [23], using the orthogonal AOs, the best function is found to contain covalent and ionic structures with coefficients 0.781 and 0.442 respectively; and the bonding arises from the large matrix elements through which they interact. It is possible to extend the approach to many-electron molecules [24], deriving matrix elements between all possible types of covalent, ionic and multiply ionic structures, without making any of the approximations made in classical semi-empirical VB theory. The conclusions are general: covalent structures alone cannot account for bonding when they are constructed from orthogonalized AOs, the electron density in the overlap regions being essentially negative until ionic (i.e. 'charge-transfer') structures are admitted. One viable way of making fully ab initio VB calculations is thus to employ orthogonalized AOs: however, although early work [25] showed great promise (in calculations on both ground and excited states), the number of structures that must be admitted for a many-electron system rapidly becomes vast. For this reason, in spite of later ab initio verification [26] of the preliminary results, the approach has been little used and will not be considered further.
5.2 The "nightmare of the inner shells"

Before turning to VB methodology proper, it is essential to justify one basic premise underlying all early work, namely that attention can be confined to the 'valence electrons' alone: for VB theory in all current forms must confront an "N! problem" arising from the need to include all permutations in the calculation of matrix elements, and if inner-shell electrons are to be explicitly included this problem becomes insuperable. The approximate 'separability' of a general N-electron system into physically distinguishable subsystems A, B, ..., each referring to its own group of $N_A$, $N_B$, ... electrons, is a familiar idea which permeates the whole of quantum chemistry: the idea is based on a wealth of experimental evidence and has been fully discussed elsewhere [27]. Briefly, a quantum system may be described as 'separable' when its wavefunction
may be represented, with high precision, as an antisymmetrized product of the form

$$\Psi(x_1, x_2, \ldots, x_N) = M\hat{A}[\Phi_A(x_1, \ldots, x_{N_A})\,\Phi_B(x_{N_A+1}, \ldots, x_{N_A+N_B})\cdots]. \tag{44}$$
Here $\hat{A}$ denotes the usual antisymmetrizer, for all electrons, while in general $\Phi_R$ is a 'group function' for the group of $N_R$ electrons in subsystem R. No exact wavefunction can be written in the form (44): for neither electrons nor basis functions (the global set from which $\Psi$ is constructed) can be uniquely partitioned among the subsystems (A, B, ... R, ...). The Pauli principle is clearly satisfied, since no particular electron labels are associated with any given subsystem; but the possibility of separating the orbital basis into subsets, which can each describe one subsystem with high accuracy, raises delicate questions of overcompleteness. To avoid such problems it is necessary to truncate the function space associated with each group of electrons, so as to eliminate excessive overlap between the functions of different groups. When the truncated sets describing different groups are mutually orthogonal, the subsystem wavefunctions are said to be strong-orthogonal; and the wavefunction ansatz (44) may then be used to break down the energy of the whole system into terms for the separate groups, supplemented by pairwise interaction terms. In the present context of a 'core-valence' separation, there will be only two groups, C and V, say, containing core and valence electrons, respectively, and the energy expression takes the form

$$E = E_C + E_V + E_{CV} = E_C + E_V^{\mathrm{eff}}, \tag{45}$$

where $E_C$ depends only on the core function $\Phi_C$ and may be calculated without reference to the valence electrons. In the second form, $E_V^{\mathrm{eff}}$ ($= E_V + E_{CV}$) contains the whole of the interaction between the electrons of the two groups and thus represents the energy of the valence electrons in an effective field provided by the core: this is the quantity considered in all the semi-empirical applications of classical VB theory, but now it may be defined and, in principle, calculated quite rigorously. The reduction is well known [28] and will be indicated only briefly.
The electron density (10) is the so-called 'diagonal element' of a more general quantity, the (spinless) one-electron density matrix, $P(\mathbf{r}; \mathbf{r}')$, defined in exactly the same way except that the variables in $\Psi^*$ carry primes, which are removed before the integrations. The reduction to (11), in terms of a basis set, remains valid, with a prime added to the variable in the starred function. For a separable wavefunction, the density matrices for the whole system may be expressed in terms of those for the separate electron groups; in particular, for a core-valence separation,

$$P(\mathbf{r}_1; \mathbf{r}_1') = P_C(\mathbf{r}_1; \mathbf{r}_1') + P_V(\mathbf{r}_1; \mathbf{r}_1'), \tag{46}$$
which means of course that the total electron density may be obtained simply by superimposing core and valence contributions. When the analogous two-electron quantities are separated in a similar way, the expression for $E_V^{\mathrm{eff}}$ in (45) takes a very transparent form:

$$E_V^{\mathrm{eff}} = \Bigl\langle\Phi_V\Bigl|\,\sum_{i=1}^{N_V}\hat{h}_{\mathrm{eff}}(i) + \tfrac{1}{2}\sum_{i,j=1}^{N_V}{}^{\prime}\, g(i,j)\,\Bigr|\Phi_V\Bigr\rangle, \tag{47}$$
and this refers explicitly only to the $N_V$ valence electrons. The effective Hamiltonian for a single valence electron in the field of the core is

$$\hat{h}_{\mathrm{eff}}(i) = \hat{h}(i) + [\hat{J}_C(i) - \tfrac{1}{2}\hat{K}_C(i)], \tag{48}$$
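where the coulomb and exchange operators ($\hat{J}$, $\hat{K}$) have the general effect

$$\hat{J}_C(1)\psi(\mathbf{r}_1) = \int g(1,1')\,P_C(\mathbf{r}_1'; \mathbf{r}_1')\,d\mathbf{r}_1'\;\psi(\mathbf{r}_1), \qquad \hat{K}_C(1)\psi(\mathbf{r}_1) = \int g(1,1')\,P_C(\mathbf{r}_1; \mathbf{r}_1')\,\psi(\mathbf{r}_1')\,d\mathbf{r}_1', \tag{49}$$

$\psi$ being an arbitrary spatial function.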
The generality and importance of the above results cannot be overemphasized. The wavefunction for the valence electrons may be optimized by variation of $\Phi_V$ alone, using the effective Hamiltonian in (47) with appropriate orthonormality constraints. In practice this means that, for a function built up from orbitals $\{\phi_r\}$ of the valence space, it is only necessary to replace matrix elements $\langle\phi_r|\hat{h}|\phi_s\rangle$ by

$$\langle\phi_r|\hat{h}_{\mathrm{eff}}|\phi_s\rangle = \langle\phi_r|\hat{h}|\phi_s\rangle + \sum_{i,j} P^{c}_{ij}\bigl[\langle\phi_r\phi^{c}_j|g|\phi_s\phi^{c}_i\rangle - \tfrac{1}{2}\langle\phi_r\phi^{c}_j|g|\phi^{c}_i\phi_s\rangle\bigr], \tag{50}$$
where superscript c indicates the core orbitals and density. It is also clear that by interchanging core and valence labels in the preceding equations, it must be possible to optimize the wavefunction for the core in the effective field due to the presence of the valence electrons. In that way, by using core functions of relatively simple form (e.g. of SCF type), it is perfectly feasible to make an iterative optimization of the whole N-electron wavefunction, with little more difficulty than making a VB calculation on the valence electrons alone. This type of constrained optimization, in which the core and valence functions are allowed to 'float' (instead of being largely predetermined by an a priori choice of basis sets), is essential in order that the resultant wavefunction may give a realistic account of electronic properties and their dependence on molecular geometry.
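The replacement of one-electron matrix elements by their 'effective' counterparts amounts to a pair of tensor contractions with the core density matrix. The sketch below shows the bookkeeping only: the array names, the index ordering of the two-electron integral arrays and the random numbers standing in for real integrals are all assumptions of the example.

```python
import numpy as np

def effective_one_electron(h_vv, p_core, g_vcvc, g_vccv):
    """h_eff[r,s] = h[r,s] + sum_ij P[i,j] * (g_vcvc[r,j,s,i] - 0.5 * g_vccv[r,j,i,s]).

    Assumed index conventions (valence orbitals r, s; core orbitals i, j):
      g_vcvc[r,j,s,i] = <phi_r phi_j(core) | g | phi_s phi_i(core)>   coulomb-type
      g_vccv[r,j,i,s] = <phi_r phi_j(core) | g | phi_i(core) phi_s>   exchange-type
    """
    coulomb = np.einsum("rjsi,ij->rs", g_vcvc, p_core)
    exchange = np.einsum("rjis,ij->rs", g_vccv, p_core)
    return h_vv + coulomb - 0.5 * exchange

# Toy data only: random numbers stand in for real integrals.
rng = np.random.default_rng(0)
n_val, n_core = 3, 2
h_vv = rng.normal(size=(n_val, n_val))
p_core = 2.0 * np.eye(n_core)              # doubly occupied orthonormal core orbitals
g_vcvc = rng.normal(size=(n_val, n_core, n_val, n_core))
g_vccv = rng.normal(size=(n_val, n_core, n_core, n_val))
h_eff = effective_one_electron(h_vv, p_core, g_vcvc, g_vccv)
print(h_eff.shape)
```

Once `h_eff` has been formed, the valence-electron calculation proceeds exactly as if the core were absent, which is the practical content of the core-valence separation.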
5.3 VB theory with non-orthogonal orbitals

Let us suppose that a good core function has been constructed and turn attention to the valence-electron system, the number ($N_V$) of electrons to be considered by VB methods now being relatively modest. It should be noted that the valence system may include a number of bonds and lone pairs and that the term 'core' is used in a general sense to include all other electrons, e.g. a number of different atomic inner shells or, in the context of organic chemistry, even the electrons of a strongly bonded 'framework' providing the effective field in which the π-electrons move. The 'separation' contemplated thus depends on the nature of the problem considered. The simplest possible approach to the construction of the VB wavefunction is essentially that outlined in Section 3; it is also perhaps the oldest, being very close to that proposed by Serber [29] in 1934. Briefly, one starts from a set of orbital products (using henceforth N in place of $N_V$)

$$\Omega_\kappa = \phi_{k_1}\phi_{k_2}\cdots\phi_{k_N}, \tag{51}$$
electron labels assumed in natural order 1, 2, ... N, as usual, and projects from each product a function of appropriate symmetry as in (28). Each of the projected functions corresponds to a VB structure with links $k_1 \to k_2$, $k_3 \to k_4$, ... and simply by changing the order of the orbitals one finds a whole set of VB structures. Noting that $\kappa$ will label the various orbital sequences $k_1k_2\ldots k_N$ ($k_p$ being the orbital in the pth place), a typical structure is

$$\Phi_\kappa = \rho_{11}\Omega_\kappa, \tag{52}$$
where $\rho_{11}$ is the Wigner operator (25) with $\kappa = \lambda = 1$. A general VB wavefunction then takes the form (38), namely $\Psi = \sum_\kappa c_\kappa\Phi_\kappa$, where the coefficients are to be determined from a set of secular equations as in (39). The maximum number of linearly independent structures for a given configuration of N singly-occupied orbitals is given in (27). These structures may be chosen in various ways (see, for example, Ref.[7] p.245) and are relatively few in number; thus, for a singlet state of the benzene π-electron system (Fig.3), with N = 6, there are 5; and for N = 10 (e.g. naphthalene) there are 42. The structures all belong to the same orbital configuration specified by the choice of orbitals in the product (51), but there is no difficulty in dealing with multi-configuration wavefunctions in which, for example, $\Phi_\lambda$ is generated from an entirely different orbital product $\Omega^L = \phi_{l_1}\phi_{l_2}\cdots\phi_{l_N}$. To anticipate the use of multi-configuration functions, it is convenient to use K, L to denote orbital sets $k_1, k_2, \ldots$ and $l_1, l_2, \ldots$, without reference to order, reserving $\kappa$, $\lambda$ to distinguish the different coupling schemes within a configuration. Thus,

$$\Phi^K_\kappa = \rho_{11}\Omega^K_\kappa, \qquad \Phi^L_\lambda = \rho_{11}\Omega^L_\lambda \tag{53}$$
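will be spin-coupled functions from configurations K and L and the general matrix element will reduce to

$$\langle\Phi^K_\kappa|\hat{H}|\Phi^L_\lambda\rangle = \langle\rho_{11}\Omega^K_\kappa|\hat{H}|\rho_{11}\Omega^L_\lambda\rangle = \langle\Omega^K_\kappa|\hat{H}|\rho_{11}\Omega^L_\lambda\rangle, \tag{54}$$

where the second step follows as in the derivation of (32). On inserting the expression for $\rho_{11}$ in (54), the general matrix element becomes

$$H^{KL}_{\kappa\lambda} = \langle\Phi^K_\kappa|\hat{H}|\Phi^L_\lambda\rangle = \frac{1}{N!}\sum_{P}\epsilon_P\,D_S(\hat{P})_{11}\,\langle\Omega^K_\kappa|\hat{H}|\hat{P}\Omega^L_\lambda\rangle. \tag{55}$$

There is a similar result, on removing the $\hat{H}$, for the overlap matrix element $M^{KL}_{\kappa\lambda}$.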
The evaluation of (55) is in principle very simple. The quantity $\langle\Omega^K_\kappa|\hat{H}|\hat{P}\Omega^L_\lambda\rangle$ involves only an orbital product ($\Omega^K_\kappa$) on the left and a permuted orbital product $\hat{P}\Omega^L_\lambda$ on the right: it therefore reduces in terms of one- and two-electron integrals ($\langle\phi_{k_i}|\hat{h}|\phi_{l_i'}\rangle$, $\langle\phi_{k_i}\phi_{k_j}|g|\phi_{l_i'}\phi_{l_j'}\rangle$, where for example the primed index $l_i'$ is that of the orbital into which $\mathbf{r}_i$ is sent by the permutation $\hat{P}$), each integral being multiplied by a chain of overlap integrals. The other factor in (55) is trivial; it is given by (36) in terms of the Rumer-Pauling superposition pattern for the coupling schemes indicated by $\kappa$ and $\lambda$. For up to about N = 10 the calculation presents no problems, depending only on the availability of efficient algorithms for generating the permutations, handling the superposition patterns, and accumulating contributions to the matrix elements. No storage of intermediate data is required and the first calculations of this kind [30] were indeed performed on a small PC. For N > 10 the direct sum over all permutations rapidly becomes prohibitive; but by exploiting the properties of the symmetric group [31] further progress can be made and calculations for up to about 20 valence electrons become feasible. This is a promising field of development. The last step in the calculation of a high-quality VB wavefunction is the variational optimization of the orbitals. The procedure to be adopted for this purpose tends to be specific to the particular type of wavefunction and method of calculation, but two main approaches may be distinguished: (i) one in which equations defining a stationary point on the energy surface are set up and solved (as, for instance, in SCF theory); and (ii) one in which the parameters that determine the orbitals are systematically varied (by standard methods of non-linear optimization theory [32]) until a minimum is found.
The procedures of type (i) may be efficient when practicable, but require heavy programming and are of limited generality; those of type (ii) tend to be more costly in computing time but are simple and completely general, requiring (at worst) only a means of evaluating the energy. This problem lies outside the scope of the present chapter.
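The direct sum over permutations described in this section is easy to sketch for the overlap element of a single perfect-pairing structure. In the brute-force illustration below the spin factor is obtained by explicitly permuting the spin function rather than from a superposition pattern, the overall normalizing factor is omitted, and all function names are hypothetical. It is practical only for small N, since N! permutations are generated.

```python
import itertools
import numpy as np

def paired_spin_function(n):
    """Perfect-pairing singlet: pairs (0,1), (2,3), ... each [ab - ba]/sqrt(2)."""
    vec = np.zeros(2 ** n)
    for flips in itertools.product((0, 1), repeat=n // 2):
        index, sign = 0, 1.0
        for pair, flipped in enumerate(flips):
            first, second = 2 * pair, 2 * pair + 1
            index |= 1 << (first if flipped else second)   # bit set = beta spin
            sign *= -1.0 if flipped else 1.0
        vec[index] += sign
    return vec / np.sqrt(2.0) ** (n // 2)

def permute_spins(vec, perm):
    """Give spin variable k the value previously carried by variable perm[k]."""
    n = len(perm)
    out = np.zeros_like(vec)
    for index in range(vec.size):
        new_index = sum(((index >> perm[k]) & 1) << k for k in range(n))
        out[new_index] = vec[index]
    return out

def parity(perm):
    """+1 for even and -1 for odd permutations, by counting inversions."""
    n = len(perm)
    inversions = sum(perm[i] > perm[j] for i in range(n) for j in range(i + 1, n))
    return -1.0 if inversions % 2 else 1.0

def vb_overlap(orbital_overlaps):
    """Sum over P of eps_P * D_S(P)_11 * <Omega|P Omega> (normalizing factor omitted)."""
    n = orbital_overlaps.shape[0]
    theta = paired_spin_function(n)
    total = 0.0
    for perm in itertools.permutations(range(n)):
        d11 = theta @ permute_spins(theta, perm)            # spin factor
        chain = np.prod([orbital_overlaps[i, perm[i]] for i in range(n)])
        total += parity(perm) * d11 * chain
    return total

s = 0.5
h2 = np.array([[1.0, s], [s, 1.0]])
print(vb_overlap(h2), 1.0 + s * s)     # the Heitler-London normalization integral
```

For two electrons only the identity and the single interchange contribute, giving 1 + S²; for orthonormal orbitals only the identity survives, which is why the classical rules are so much simpler.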
5.4 Connection with other methods

Although modern VB theory is by now well established, in various forms, it should be remembered that the foundations were all laid in the Pauling era. The first true multi-configuration theory was, in fact, that proposed by Serber [29]: it was capable in principle of ab initio implementation and was the precursor of several group theoretical approaches, mostly based on the use of BD functions, which we mention before indicating the connection with Slater methods.
Branching diagram and related methods

Serber's procedure may be related to the approach of Sect.5.3 as follows. Instead of projecting symmetry-adapted functions (all for $\kappa = 1$) from a variety of orbital products $\Omega_\kappa = k_1k_2\ldots k_N$, one may use a single product (with the orbitals of the configuration in a standard order) and then apply the more general Wigner operators (25): the resultant functions $\Phi^{(\mu)}_\kappa = \rho_{\kappa\mu}\Omega$ will all transform like the basis function $\Theta_\kappa$ and for different values of $\mu$ will provide a full set of linearly independent functions associated with different BD paths. The matrix elements that will determine their mixing will then be, for a single configuration,

$$\langle\rho_{\kappa\mu}\Omega|\hat{H}|\rho_{\lambda\nu}\Omega\rangle = \langle\Omega|\hat{H}|\rho_{\mu\kappa}\rho_{\lambda\nu}\Omega\rangle = \delta_{\kappa\lambda}\langle\Omega|\hat{H}|\rho_{\mu\nu}\Omega\rangle,$$

where the first step follows because $\rho_{\kappa\mu}$ may be passed from bra to ket provided the operator is replaced by its adjoint (with order of the indices inverted); and the second follows from (29). On expanding the operator and putting $\kappa = \lambda = 1$, it follows that (apart from normalization)

$$\langle\Phi^{(\mu)}|\hat{H}|\Phi^{(\nu)}\rangle = \sum_{P}\langle\Omega|\hat{H}|\hat{P}\Omega\rangle\,D_S(\hat{P})_{\mu\nu}, \tag{56}$$
with a similar expression for the overlap matrix element. The matrices in the secular equation (39) thus take the form

$$\mathbf{H} = \sum_{P} H_P\,\mathbf{D}_S(\hat{P}), \qquad \mathbf{M} = \sum_{P} M_P\,\mathbf{D}_S(\hat{P}), \tag{57}$$
where the numerical coefficients are $H_P = \langle\Omega|\hat{H}|\hat{P}\Omega\rangle$ and $M_P = \langle\Omega|\hat{P}\Omega\rangle$. This result was given in Serber's first paper; the multiconfiguration form, in which a configuration index (K, L) is added to each $\Omega$ to give matrices in block form, followed in the second paper. The difference between (56) and (55) is important: (56) requires explicit knowledge of N! matrices, whose evaluation involves non-trivial algorithms [32] (see Ch.7 of ref.[11]), for the representation provided by the BD functions; (55) requires only the 11-element of each matrix, for a pair of WR functions, whose evaluation is trivial. It is not surprising that Serber's method was never pursued in actual computations; but the same general approach, often with sophisticated algorithms for generating the representation matrices, underlies a number of current methods. The main developments in this general area, largely due to Goddard, Gerratt, Gallup and their many collaborators, are well reviewed elsewhere [33],[34],[35].
Methods based on Slater determinants

As in the discussion above, let us use the operator (25) to extract, from an arbitrary $\Omega$, a component behaving like $\Theta_\kappa$, namely

$$\Phi^{(\lambda)}_\kappa = \rho_{\kappa\lambda}\Omega = \sum_{P}\epsilon_P\,D_S(\hat{P})_{\kappa\lambda}\,\hat{P}\Omega \tag{57}$$

(omitting the trivial normalizing factor): there will be several such functions depending on which column ($\lambda$) of the matrix $\mathbf{D}_S(\hat{P})$ has been taken. The fundamental result (18) may now be used to construct a totally antisymmetric wavefunction with spin included. This will be, for any choice of $\lambda$,
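$$\Psi = \sum_{\kappa}\Phi^{(\lambda)}_\kappa\Theta_\kappa = \sum_{P}\epsilon_P\,\hat{P}\Omega\Bigl(\sum_{\kappa}D_S(\hat{P})_{\kappa\lambda}\Theta_\kappa\Bigr)$$

and, by making use of the representation property (19), the quantity in parentheses may be replaced by $\hat{P}\Theta_\lambda$. Finally, then,

$$\Psi(x_1, x_2, \ldots, x_N) = \hat{A}[\Omega(\mathbf{r}_1, \mathbf{r}_2, \ldots, \mathbf{r}_N)\,\Theta_\lambda(s_1, s_2, \ldots, s_N)] \tag{58}$$

will be a fully antisymmetric space-spin function, formed by applying the antisymmetrizer (with the conventional normalization such that $\hat{A}^2 = \hat{A}$)

$$\hat{A} = (N!)^{-1}\sum_{P}\epsilon_P\hat{P} \tag{59}$$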
to the product $\Omega\Theta_\lambda$, for any chosen spin eigenfunction. Since $\Theta_\lambda$ is a general BD spin eigenfunction and may be expanded in terms of products such as $\alpha(s_1)\beta(s_2)\ldots\alpha(s_N)$, it is clear that $\Psi$ in (58) is a spin eigenfunction formed as a linear combination of Slater determinants, the first one being $|\phi_1\alpha\;\phi_2\beta\;\cdots\;\phi_N\alpha|$, i.e. an antisymmetrized product of spin-orbitals. It is thus perfectly possible to perform VB calculations using standard Slater methods and this approach has been used quite widely, first in the early work [36] and more recently in [35] and in the work of Balint-Kurti, van Lenthe, and their collaborators [37]. Unfortunately, the use of Slater determinants does not lead to any great reduction in computational problems, because (i) the numbers involved may be rather large; and
391 (ii) the matrix elements of the Hamiltonian between determinants of non-orthogonal orbitals are not easily reduced, Slater's simple rules being replaced by those of LSwdin [38], which require the evaluation of vast numbers of cofactors of an overlap matrix. In such calculations a key role is therefore played by algorithms connected with the computation and management of cofactors [39].
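To make point (ii) concrete, the following is a minimal Python sketch (using NumPy) of the overlap and the one-electron matrix element between two determinants built from non-orthogonal orbitals. It assumes determinants carrying the usual $(N!)^{-1/2}$ factor, so that the overlap is $\det S$ and the one-electron element is $\sum_{ij} h_{ij}\,\mathrm{cof}(S)_{ij}$, where $S$ and $h$ are the orbital overlap and one-electron integral matrices between the two orbital sets. The function names and the numerical matrices are purely illustrative and are not taken from the chapter; two-electron terms, which need second-order cofactors, are omitted.

```python
import numpy as np

def cofactor_matrix(S):
    """First-order cofactors of S, computed directly from its minors."""
    n = S.shape[0]
    C = np.empty((n, n), dtype=float)
    for i in range(n):
        for j in range(n):
            # Delete row i and column j, then attach the sign (-1)^(i+j).
            minor = np.delete(np.delete(S, i, axis=0), j, axis=1)
            C[i, j] = (-1) ** (i + j) * np.linalg.det(minor)
    return C

def determinant_overlap(S):
    """Overlap of two determinants = determinant of the orbital overlap matrix."""
    return float(np.linalg.det(S))

def one_electron_element(h, S):
    """Sum over orbital pairs of h_ij times the cofactor of S_ij."""
    return float(np.sum(h * cofactor_matrix(S)))

if __name__ == "__main__":
    # Hypothetical 3-orbital example: overlaps S and one-electron integrals h.
    S = np.array([[1.0, 0.4, 0.1],
                  [0.4, 1.0, 0.4],
                  [0.1, 0.4, 1.0]])
    h = np.array([[-1.0, -0.5, -0.1],
                  [-0.5, -1.2, -0.5],
                  [-0.1, -0.5, -1.0]])
    print("overlap      :", determinant_overlap(S))
    print("one-electron :", one_electron_element(h, S))
```

For orthonormal orbitals ($S$ equal to the unit matrix) every diagonal cofactor is 1 and every off-diagonal cofactor vanishes, so the expression collapses to Slater's rule, the sum of the diagonal $h_{ii}$. The number of minors grows quickly with the number of electrons, which is why cofactor management dominates such calculations.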
Some general comments

The principal methods in current use differ mainly in (i) their use of group-theoretical or Slater procedures; (ii) whether or not the orbitals are optimized; and (iii) whether one or more orbital configurations are employed. In the group-theoretical approach, single-configuration VB functions were first used extensively by Goddard [40], who succeeded in formulating conditions for a stationary value of the energy associated with a single VB structure, in the form of a pseudo-eigenvalue equation which could be solved iteratively to determine the optimum orbitals. The resultant 'GVB theory' has been widely used, with considerable success. The multi-structure generalization (still for one orbital configuration) was provided by Gerratt and co-workers [33], who used the name 'spin-coupled VB theory' to indicate that the spin coupling was also optimized (through the coefficients with which different structures were allowed to mix). As BD functions are employed, these methods do not lead to wavefunctions with an immediate interpretation in terms of the (WR) structures of classical VB form; and to make contact with Pauling's interpretations an a posteriori transformation is usually necessary.
Multi-configuration VB theory was developed in an entirely different way by Gallup and his collaborators [34], who employed a large number of configurations as an alternative to optimizing the orbitals. This approach bears some resemblance to that used in Sect.5.3, being based on a group theoretical projection operator; but the projected functions correspond to the top path in the branching diagram and again the 'structures' that result have no immediate connection with those of classical VB theory. On the other hand, the approach developed in Sect.5.3 leads directly to wavefunctions expressed in terms of the structures used by Pauling, Wheland, and their contemporaries; there is no limitation to a single configuration (so ionic structures can also be admitted); and the orbitals may be optimized by straightforward numerical methods. Methods based on the use of determinants, particularly those of Balint-Kurti and co-workers [37], share the advantages of simplicity and flexibility; and in spite of a certain lack of mathematical elegance they also readily admit orbital optimization and the use of multiconfiguration wavefunctions. All the methods referred to have reached a certain level of technical perfection and may be used in ab initio VB calculations on molecules with up to, say, 15 electrons in the 'active space' i.e. outside a 'core'. Which one is employed is therefore largely a question of personal taste.
6. SOME ILLUSTRATIVE APPLICATIONS

The aim of this Section is to present the results of some typical VB calculations, all made using the method described in Section 5.3. In all cases, the basis chosen was of modest quality (usually gaussian 'double-zeta' for the valence orbitals) in order to facilitate comparison with other calculations reported in the literature (e.g. 'full-CI'), for which use of a more extended basis would not have been feasible. The molecular geometries employed range from equilibrium to virtually complete dissociation. Special emphasis is placed on (i) the validity of a core/valence separation, with and without 'freezing' of the core orbitals; (ii) the quality of a perfect-pairing approximation, where appropriate, and the need for resonance mixing as the geometry changes; (iii) the effect of various constraints during the optimization of the orbitals, and the way in which they affect the qualitative picture of the origin of the bonds.

6.1 The water molecule

Equilibrium geometry is assumed and a contracted gaussian basis is used [41]. The molecule is dissociated by symmetric stretch, energies being calculated at bond length intervals of ΔR = 0.2Re up to R = 6.0Re (where dissociation is effectively complete). A standard closed-shell SCF calculation is used for comparison; the 1s and 2py orbitals (the latter normal to the plane) define the core to be used in the 'frozen core' approximation. The VB calculations are performed with just two covalent structures, these being sufficient (according to the qualitative discussion in Sect. 4) to describe dissociation in which the oxygen is left in its triplet ground state. Curve (a) in Fig. 6 refers to the frozen core approximation; Curve (b) shows the effect of core optimization, which is evidently significant, as the core returns to its free-atom form.
Another significant feature of both curves is the growing importance of the second structure, in which the spin pairing in the bonds is replaced by that between the hydrogen atoms and within the oxygen atom; when the bonds are stretched to several times their normal length the structure coefficient ratio approaches 1:2. It is easily verified that the corresponding combination of WR spin eigenfunctions then reproduces the 'top' BD function, in which the spins on the oxygen atom are triplet-coupled (as are those on the hydrogens), the two triplets being coupled to a total spin zero. Examination of the orbitals shows that, as dissociation proceeds, the oxygen hybrids shrink back into pure 2p orbitals and the atom thus reverts to its normal ³P ground state. The 'local' triplet coupling of the hydrogen spins indicates a growing repulsion. These findings verify completely the qualitative ideas current in the 'thirties. The calculations reported so far are based on unconstrained mixing of all valence functions: as a result, the optimized orbitals differ greatly from those pictured by Pauling, which, although usually hybrids, were strictly monocentric in character. The optimized forms resemble more closely the Coulson-Fischer orbitals of Sect. 2, being 'distorted' AOs which result in considerably increased overlap in the bond regions. In this general context, such AOs have been referred to as 'overlap-enhanced
orbitals' [42]. A calculation in which the mixing is constrained, so that every orbital remains essentially monocentric (except for a small contamination arising from orthogonalization against the core), gives very poor results: the overlap enhancement resulting from free inter-atomic mixing is thus vital to the success of an approximation based on covalent structures alone. On the other hand, it should be possible to retain a description in classical VB language by admitting polar structures, exactly as in refining the Heitler-London function for H2. As indicated in Sect. 2, the admission of polar structures can well describe bonding even when 'covalent' structures alone fail to do so. The bonding then results from the strong interaction between structures that differ by a single electron transfer across a bond, and is interpreted pictorially in terms of 'electron hopping'. By including 8 ionic structures of this type, and optimizing the multi-configuration function that results, the situation is completely restored: the resultant energy curve is almost coincident with Curve (b).
Figure 6. PE curves for H2O (SCF and curves a-d; horizontal axis D/De).

Figure 7. Reduced PE curves for H2O (horizontal axis D/De).
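The reduced curves of Fig. 7 follow the definition $E_{\mathrm{red}}(x) = [E(x) - E(\infty)]/[E(\infty) - E(1)]$ with $x = R/R_e$, which the text gives as Eq. (60). A minimal Python sketch of that transformation is shown here. The Morse-type model curve and its parameters are hypothetical, chosen only to exercise the formula; they are not data from the calculations reported in this Section.

```python
import math

def reduced_energy(E_x, E_eq, E_inf):
    """Reduced energy of Eq. (60): -1 at equilibrium, 0 at infinite separation."""
    return (E_x - E_inf) / (E_inf - E_eq)

def morse(x, depth=0.35, a=2.0, E_inf=-75.6):
    """Hypothetical model curve in hartree; x = R/Re, minimum at x = 1."""
    return E_inf + depth * ((1.0 - math.exp(-a * (x - 1.0))) ** 2 - 1.0)

if __name__ == "__main__":
    E_inf = -75.6
    E_eq = morse(1.0)
    for x in (1.0, 1.5, 2.0, 4.0, 6.0):
        print(f"x = {x:3.1f}   E_red = {reduced_energy(morse(x), E_eq, E_inf):8.4f}")
```

Because the well depth is divided out, curves obtained at different levels of theory can be compared for shape alone, independently of their absolute energies.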
For comparison purposes, Curve (c) shows the results of an accurate 'coupled-cluster' calculation (obtained using Gaussian 94), while (d) shows those of a full CI calculation [43] (for electrons outside a frozen core), in which about 250 000 Slater determinants were employed. It is well known that the slow convergence of large CI calculations is due to the difficulty of approximating the short-range correlation which inhibits the approach of two electrons as $r_{12} \to 0$. Most of the difference between Curves (b) and (d) must result from this correlation error, which is apparently not strongly dependent on geometry variation. It is revealing to compare potential energy curves by introducing a 'reduced' curve [44]: the reduced PE curve results from a plot of

$$E_{\mathrm{red}}(x) = \frac{E(x) - E(\infty)}{E(\infty) - E(1)}, \qquad x = R/R_e, \qquad (60)$$
which goes from -1 at equilibrium to 0 at infinity. The reduced curve obtained in this way from Curve (b) is compared with that from the coupled-cluster and full-CI results in Fig. 7. Evidently the general shape of the two-structure VB energy curve is in excellent agreement with that obtained from a full-CI study.

6.2 Methyllithium

The nature of the carbon-metal bond is still not well understood: there is, for example, no accepted and unambiguous interpretation of calculated wavefunctions even for the simplest organometallic compound, methyllithium. Pauling pioneered the use of electronegativity scales in the discussion of the ionic character of bonds: lithium is highly electropositive and is expected to form strongly polar, even almost purely ionic, bonds with more electronegative elements. Observations of dipole moments, solubilities, and electric conductivity of solutions all support this conclusion: but theoretical predictions in the literature range from almost purely covalent to almost completely ionic. The disagreement arises from the diversity of the methods of calculation employed and can only be fully resolved by translating Pauling's arguments into their ab initio counterparts. As in the above discussion of the water molecule, the simplest VB procedure would be to fully optimize the orbitals in a perfect-pairing representation of the wave function: but that would obviously preclude any interpretation of the bonding in terms of covalent-ionic resonance. Again, the only way of obtaining a classical VB interpretation of the C-Li bond would be to 'force' the admission of ionic structures by excluding the mixing of carbon and lithium orbitals. When this is done [43], the classical structures that show the carbon-lithium bond as C-Li, C⁻Li⁺, and C⁺Li⁻ appear with weights* 0.635, 0.376, and -0.011, respectively.
The predicted 38% ionic character is strikingly close to the 43% estimated by Pauling on the basis of his electronegativity values (1.0 and 2.5, respectively) for lithium and carbon. This molecule has been fully discussed [45], both in its equilibrium geometry and over the whole range of distances for breaking of the carbon-lithium bond. Again, as dissociation proceeds the geometry and orbital forms change drastically: the carbon hybrids go from roughly tetrahedral at equilibrium to trigonal planar in the CH3 fragment. The results provide a coherent account of the behaviour of the system, in the language of classical VB theory but with the support of ab initio calculation.

* The appropriate definition, for non-orthogonal structures, is $W_K = \sum_L C_K M_{KL} C_L$, where $M_{KL}$ is the overlap of structures K, L. The sum of the weights is then unity, but the weights close to zero may become slightly negative.

6.3 Lithium fluoride

The bond in lithium fluoride is normally regarded as highly ionic, the Pauling electronegativities being roughly 1.0 and 4.0 for Li and F, respectively; and this raises the question of whether the molecule will dissociate into neutral atoms or into ions. As in previous examples, it must be possible to obtain a satisfactory energy curve by using
a single covalent structure, provided the orbitals are optimized with no constraints: for the two orbitals of the bond could then delocalize over both centres as in the Coulson-Fischer calculation [5]. But this would offer no simple connection with the ideas of classical VB theory, and it therefore seems worthwhile to impose constraints as in Sect. 6.2. Two types of calculation are made: first using a single covalent structure, with two electrons in a fluorine 'lone pair' and two in the bond pair F-Li, free mixing of all valence functions being allowed; secondly, a calculation of 'covalent plus ionic' type (three structures) but with the constraint of no inter-atomic mixing. In the first case, the bonding is formally covalent but the orbitals are free to move from one centre to the other during optimization; in the second case the orbitals remain monocentric and the structures then retain their classical interpretation. The core/valence separation is used throughout, only 4 electrons remaining in the valence shell, but the core function is optimized as in Sect. 6.1. The results of the second calculation, with monocentric orbitals, are slightly inferior to those of the first: they can be improved by adding more structures but the results of the three-structure calculation give such a clear picture of the bond breaking that further refinement is unnecessary.
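The 43% electronegativity-based estimate quoted for methyllithium in Sect. 6.2 can be reproduced from Pauling's empirical relation between fractional ionic character and electronegativity difference, $1 - \exp[-(\chi_A - \chi_B)^2/4]$. The chapter does not state which relation was used, so applying this particular formula is an assumption, and the function name below is illustrative only.

```python
import math

def pauling_ionic_character(chi_a, chi_b):
    """Fractional ionic character from Pauling's empirical relation
    1 - exp(-(chi_a - chi_b)**2 / 4); assumed, not quoted from the chapter."""
    return 1.0 - math.exp(-0.25 * (chi_a - chi_b) ** 2)

if __name__ == "__main__":
    # Electronegativities quoted in Sect. 6.2: Li = 1.0, C = 2.5.
    frac = pauling_ionic_character(1.0, 2.5)
    print(f"Li-C: {100 * frac:.0f}% ionic character")  # prints "Li-C: 43% ionic character"
```

The estimate depends only on the magnitude of the electronegativity difference, so it carries no information about how the ionic character changes as the bond is stretched; that is supplied by the structure weights of the ab initio calculation.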
Figure 8. PE curve (a) and structure weights (b) for LiF (horizontal axis R in bohr).

Fig. 8 shows the energy variation during bond breaking; and also shows how the weights of the three structures change in the process. The results for equilibrium geometry are in good accord with classical expectations: on the basis of his electronegativity scale, Pauling would have predicted a bond with very nearly 80% ionic character. The ab initio value obtained in the present work is 95%, the weights of the three structures being 0.050, 0.950, -0.030 for F-Li, F⁻Li⁺, F⁺Li⁻, respectively.
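The structure weights quoted here follow the definition given in the footnote to Sect. 6.2, $W_K = \sum_L C_K M_{KL} C_L$ (the Chirgwin-Coulson weights), for a normalized wavefunction. A minimal NumPy sketch follows. The coefficients and the structure overlap matrix are hypothetical numbers, chosen only to illustrate that the weights sum to unity and that a small weight can turn slightly negative; they are not the values of the LiF calculation.

```python
import numpy as np

def structure_weights(C, M):
    """Weights W_K = sum_L C_K M_KL C_L for structure coefficients C
    and structure overlap matrix M; C is normalized first."""
    C = np.asarray(C, dtype=float)
    M = np.asarray(M, dtype=float)
    C = C / np.sqrt(C @ M @ C)      # enforce <Psi|Psi> = 1
    return C * (M @ C)

if __name__ == "__main__":
    # Hypothetical three-structure example (covalent, ionic, reverse-ionic).
    M = np.array([[1.0, 0.6, 0.2],
                  [0.6, 1.0, 0.3],
                  [0.2, 0.3, 1.0]])
    C = [0.3, 0.8, -0.1]
    W = structure_weights(C, M)
    print("weights:", np.round(W, 4))
    print("sum    :", round(float(W.sum()), 12))
```

With orthogonal structures ($M$ equal to the unit matrix) the weights reduce to the squared coefficients; the overlap terms are what allow a minor structure to acquire a small negative weight.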
As the bond is stretched, the covalent structure becomes more important, but the weight of F⁻Li⁺ remains above 80% until about R = 5.0 bohr: at this point, where dissociation is well advanced, the weights would suggest a 20% probability of separation into singlet-coupled neutral atoms, with an 80% probability of finding two separate ions. Energetically, however, the covalent coupling is favoured: at R = 9.0 bohr, where dissociation is virtually complete, the lowest-energy state contains only 1.4% ionic character and describes simply the neutral atoms with singlet coupling between the electrons of their singly occupied valence orbitals. The first excited state, with 98.6% ionic character, represents the pair of ions, whose energy is about 0.24 hartree higher than that of the neutral pair. As in Sect. 6.1, comparison with SCF and full-CI curves (the latter, even with frozen 1s cores, involving almost 10 million Slater determinants) testifies to the adequacy of the description in terms of three VB structures, the reduced energy curve being in excellent agreement with that from the full-CI calculation. Two states are indicated in Fig. 8 to reveal an 'avoided crossing' in which the structure weights for Li-F and Li⁺F⁻ change abruptly within a very short interval; this is not evident when only one covalent function, based on delocalized orbitals, is employed.

6.4 Benzene and its ions
This Chapter could hardly close without reference to resonance in organic chemistry, where its impact has been so enormous. The classic example provided by the benzene molecule was first studied using ab initio VB methods by Gerratt and Raimondi [46], who performed calculations on the π-electron system using all five spin-coupled structures (Fig. 3) with full optimization of both orbitals and mixing coefficients. Their results gave an impressive demonstration of the fact that, with orbital optimization, a few covalent VB structures could account for almost the whole of the energy obtained from a large CI calculation. With just 6 2pπ orbitals as the π-electron basis (inner shells and σ-bonds providing the usual frozen core), the 5 covalent structures reproduced almost perfectly the results of the 'full-CI' calculation which included all 175 covalent and ionic structures: in fact, two Kekulé structures alone accounted for more than 90% of the π-electron correlation energy. The forms of the optimized orbitals were reminiscent of those found by Coulson and Fischer: although strongly localized around each carbon centre, each was polarized towards its neighbours to give a set of overlap-enhanced orbitals (OEOs). Without overlap enhancement the 5 covalent structures alone cannot match even the results of a simple SCF MO calculation. The accuracy of a single-configuration VB description of a typical π-electron system, which confirms the validity of the classical picture, is not peculiar to benzene. Similar calculations [47] on the naphthalene molecule yield OEOs of similar form and again give an accurate description of the electronic structure in terms of resonance among a few 'principal structures'. On turning, however, to ionization and attachment processes the one-configuration picture is no longer sufficient, even at a qualitative
level. In view of the importance of such processes in organic chemistry, where the ease with which a conjugated molecule can accept or donate a π electron is of great interest in theories of reactivity, it is worth indicating briefly what modifications are necessary. The most important feature of ionization and attachment processes, from the standpoint of VB theory, is the increase in the number of structures that must be included in the wavefunction. Thus, for benzene, the number of covalent structures is 5, all belonging to the configuration with 6 singly-occupied orbitals and differing only in spin-coupling schemes. As already noted, this 1-configuration approximation, fundamental to many forms of VB theory, can often give excellent results. But on removing one electron the number of orbital configurations increases; for benzene, for example, there will be six, according to which orbital contains the 'hole'. In general, corresponding structures may be indicated by adding a + at the position of the empty orbital and a dot at the position of the orbital whose electron remains unpaired. Thus, in benzene, for each position of the +, the spins may be coupled in 5 linearly independent ways; and this gives 30 structures in all. Similar considerations apply to the negative ion, where 30 structures must be admitted, the + of the positive ion being replaced by a - to indicate the doubly occupied orbital. All such structures are 'covalent' in the sense that no electron-pair link has been replaced by a (+,-) pair, as would be the case in forming a conventional ionic structure by an internal electron transfer. To ensure comparability with the neutral molecule calculations, it is therefore necessary to include all 30 covalent structures. A full account of the results of such calculations, for benzene and pyridine, is available elsewhere [48].
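The structure counts quoted in this Section (5, 175 and 30) can be checked with two standard dimension formulas: the number of linearly independent spin eigenfunctions for N electrons with total spin S, and Weyl's formula for the number of spin-adapted configurations of N electrons in m orbitals. The short Python sketch below uses illustrative function names and passes 2S as an integer to avoid half-integers.

```python
from math import comb

def spin_couplings(n_electrons, two_s):
    """Number of independent spin eigenfunctions for N electrons, spin S = two_s/2."""
    k = (n_electrons - two_s) // 2
    return (two_s + 1) * comb(n_electrons + 1, k) // (n_electrons + 1)

def weyl_dimension(m_orbitals, n_electrons, two_s):
    """Weyl's formula: spin-adapted configurations of N electrons in m orbitals."""
    a = (n_electrons - two_s) // 2
    b = (n_electrons + two_s) // 2 + 1
    return ((two_s + 1) * comb(m_orbitals + 1, a) * comb(m_orbitals + 1, b)
            // (m_orbitals + 1))

if __name__ == "__main__":
    # Neutral benzene: 6 singly occupied orbitals, singlet.
    print(spin_couplings(6, 0))        # 5 covalent structures
    # All covalent and ionic singlet structures, 6 electrons in 6 orbitals.
    print(weyl_dimension(6, 6, 0))     # 175
    # Cation: 6 positions of the hole, 5 unpaired-electron doublet couplings each.
    print(6 * spin_couplings(5, 1))    # 30
```

For the anion the same count of 30 arises, since the doubly occupied orbital contributes no spin freedom and the remaining five singly occupied orbitals again couple to a doublet in 5 ways.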
The approach of Sect. 5.3 is applicable without change and leads to wavefunctions and energies much better than those based on (open-shell) SCF theory. What is more important, however, is the simple interpretation of the wavefunction in terms of a few energetically favoured structures. The weights of these structures appear to govern the chemistry of the ion, the position of the 'dot' being important for radical attack, and the position of the + or - for attack by charged species. There seems to be little doubt that further work along such lines will provide ab initio support for many of the intuitive principles formulated in the Pauling-Wheland era.

CONCLUSION

It remains only to draw together the main conclusions from this work. They may be stated very briefly as follows:

• The main concepts and principles used so effectively by Pauling and his contemporaries are just as valid now as they were sixty years ago: chemical bonds may be well described by pairing the spins of the electrons in singly occupied valence orbitals (strong bonds being associated with strong overlap) to obtain a single VB structure; and when more than one structure is intuitively acceptable the description must be extended to admit resonance, i.e. the wavefunction must be written as a
linear combination of structures corresponding to alternative pairing schemes.

• Fully ab initio variational calculations, using a wavefunction consisting of VB structures and with optimization of both orbitals and structure coefficients, may be carried out by a variety of methods: these range from the direct 'spin-free' approach, in which only the permutation symmetry of the wavefunction is used (as in the 'pre-Slater' era), to methods in which the structures (with spin factors included) are expanded over determinants of spin-orbitals.

• The long-established practice of 'separating' the wavefunction into a product of core and valence functions is entirely satisfactory, provided a strong-orthogonality condition is imposed: the electrons of the core provide an effective field for the valence electrons and vice versa and this permits separate optimization (in rotation) of core and valence functions. In this way VB methodology may be applied where most appropriate, notably to the valence electrons of molecules involved in bond-breaking processes, while simpler approximations may be used to represent the 'passive' groups of electrons (e.g. atomic cores) which undergo only minor changes. The limitations on the calculation are then set by the number of 'active' electrons (those of the VB group) and in this way the VB approach becomes viable for comparatively large molecules.

• Optimization of the orbitals, no matter how large their mutual overlap may become, is central to any VB calculation based on a small number of structures: orthogonalization of the orbitals removes all the problems of VB theory but at the same time removes the bonds! With free mixing of the basis functions on all centres, a few covalent structures can often give an excellent description of the electronic structure of the molecule: the optimal orbitals are then invariably overlap enhanced, the OEO overlaps commonly exceeding 0.9, and ionic structures are often of negligible importance.
This is the situation in single-configuration forms of VB theory (e.g. [33]), where all spin coupling schemes may be admitted but only one set of occupied orbitals.

• Closer contact with VB theory in its classical form can be restored by imposing a constraint of monocentricity on the orbitals, permitting free mixing of basis functions on the same atom but not on different atoms: this is indeed in the spirit of Pauling's original use of hybridization. In this case covalent structures alone are insufficient to give an accurate wavefunction, but this may be remedied by admitting a small number of ionic structures, namely those which arise from electron transfer between bonded atoms. Each singly ionic structure will have one empty orbital and one doubly-occupied, and will belong to a new configuration with its own set of pairing schemes. For many purposes the use of such constraints is an advantage: the concept of 'covalent-ionic resonance' plays an important descriptive role, and chemistry would be the poorer without it. The limited CI associated with the ionic structures leads to ab initio wavefunctions of similar (or better) quality than those provided
by a single configuration of OEOs; but, more importantly, it leads to a transparent interpretation, in classical terms, of what is actually happening in many chemical processes. The few examples in Sect. 6 speak for themselves: they show clearly that, in spite of magnificent advances in 'computational chemistry', the use of a handful of VB structures can provide a decent alternative to calculations employing many millions of Slater determinants. In the much quoted words of C. A. Coulson, they offer 'primitive patterns of understanding' that are at the heart of chemistry and not easily found in other ways.

REFERENCES

[1] Heitler, W. and London, F. (1927). Z. Phys. 44, 455.
[2] Pauling, L. and Wilson, E. B. (1935). 'Introduction to Quantum Mechanics'. McGraw-Hill, New York.
[3] van Vleck, J. H. and Sherman, A. (1935). Rev. Mod. Phys. 7, 167.
[4] Pauling, L. (1928). Chem. Rev. 5, 173.
[5] Coulson, C. A. and Fischer, I. (1949). Phil. Mag. 40, 386.
[6] Mueller, C. R. and Eyring, H. (1951). J. Chem. Phys. 19, 1495.
[7] McWeeny, R. (1993). 'Methods of Molecular Quantum Mechanics', 2nd ed. Academic, London.
[8] Deb, B. M. (ed.) (1981). 'The Force Concept in Chemistry'. Van Nostrand, New York.
[9] McWeeny, R. (1951a). J. Chem. Phys. 19, 1614; McWeeny, R. (1951b). J. Chem. Phys. 20, 920; McWeeny, R. (1952). Acta Crystallogr. 5, 463.
[10] Mulliken, R. (1955a). J. Chem. Phys. 23, 1833; Mulliken, R. (1955b). J. Chem. Phys. 23, 2343.
[11] Pauncz, R. (1979). 'Spin Eigenfunctions: Construction and Use'. Plenum, New York.
[12] Wigner, E. P. (1959). 'Group Theory and its Application to the Quantum Mechanics of Atomic Spectra'. Academic, New York.
[13] Weyl, H. (1956). 'The Theory of Groups and Quantum Mechanics'. Dover, New York (translated from the German edition of 1928); Rumer, G. (1932). Gottingen Nachr. p. 377.
[14] Pauling, L. (1933). J. Chem. Phys. 1, 280.
[15] Cooper, I. L. and McWeeny, R. (1966). J. Chem. Phys. 45, 226.
[16] Dirac, P. A. M. (1929). Proc. Camb. Phil. Soc. 25, 62.
[17] van Vleck, J. H. (1934). Phys. Rev. 45, 405.
[18] Wheland, G. W. (1965). 'Resonance in Organic Chemistry'. Elsevier, Amsterdam.
[19] Hückel, E. (1930). Z. Phys. 60, 423.
[20] Pauling, L. (1931). J. Amer. Chem. Soc. 53, 1367.
[21] Löwdin, P.-O. (1950). J. Chem. Phys. 18, 365.
[22] Slater, J. C. (1951). J. Chem. Phys. 19, 220.
[23] McWeeny, R. (1954). Proc. R. Soc. Lond. A223, 63.
[24] McWeeny, R. (1954). Proc. R. Soc. Lond. A233, 306.
[25] McWeeny, R. (1955). Proc. R. Soc. Lond. A227, 288.
[26] Campion, W. J. and Karplus, M. (1973). Mol. Phys. 25, 921.
[27] McWeeny, R. (1997). In 'Quantum Systems in Chemistry and Physics', Proceedings of a European Workshop (ed. Maruani, J.). Kluwer, Dordrecht (in press).
[28] McWeeny, R. (1959). Proc. R. Soc. Lond. A253, 242; McWeeny, R. (1960). Rev. Mod. Phys. 32, 335.
[29] Serber, R. (1934). Phys. Rev. 45, 461; Serber, R. (1934). J. Chem. Phys. 2, 697.
[30] McWeeny, R. (1988). Int. J. Quantum Chem. 34, 25; McWeeny, R. (1990). Int. J. Quantum Chem. Symp. 24, 733.
[31] Zhang, Q. and Li, X. (1989). J. Mol. Struct. 198, 413; Li, J., Wu, W. and Zhang, Q. (1993). Chin. Sci. Bull. 37, 2243; Li, J. and Wu, W. (1993). Theor. Chim. Acta 89, 105.
[32] Press, W. H., Flannery, B. P., Teukolsky, S. A. and Vetterling, W. T. (1989). 'Numerical Recipes'. Cambridge University Press, New York.
[33] Gerratt, J. (1971). Adv. At. Mol. Phys. 7, 141; Cooper, D. L., Gerratt, J. and Raimondi, M. (1987). Adv. Chem. Phys. 69, 319.
[34] Gallup, G. A. (1973). Adv. Quantum Chem. 16, 229; Gallup, G. A., Vance, R. L., Collins, J. R. and Norbeck, J. M. (1982). Adv. Quantum Chem. 16, 229.
[35] Raimondi, M., Simonetta, M. and Tantardini, G. F. (1985). Comput. Phys. Rev. 2, 171.
[36] Miller, J., Friedman, R., Hurst, R. and Matsen, F. A. (1957). J. Chem. Phys. 27, 1386; Balint-Kurti, G. G. and Karplus, M. (1968). J. Chem. Phys. 50, 478; Raimondi, M., Simonetta, M. and Tantardini, G. F. (1972). J. Chem. Phys. 56, 5091.
[37] van Lenthe, J. H. and Balint-Kurti, G. G. (1980). Chem. Phys. Lett. 76, 138; Verbeck, J. and van Lenthe, J. H. (1991). J. Mol. Struct. (Theochem) 229, 115.
[38] Löwdin, P.-O. (1955). Phys. Rev. 97, 1474.
[39] King, H. F., Stanton, R. E., Kim, H., Wyatt, R. E. and Parr, R. G. (1967). J. Chem. Phys. 47, 1936. For more recent work see Figari, G. and Magnasco, V. (1985). Mol. Phys. 55, 319; Amovilli, C. (1997). In 'Quantum Systems in Chemistry and Physics', Proceedings of a European Workshop (ed. Maruani, J.). Kluwer, Dordrecht (in press).
[40] Goddard, W. A. (1967). Phys. Rev. 157, 73; Phys. Rev. 157, 81; Goddard, W. A. (1968). J. Chem. Phys. 48, 450; J. Chem. Phys. 48, 5377.
[41] Dunning, T. H. (1970). J. Chem. Phys. 53, 2823.
[42] McWeeny, R. and Jorge, F. E. (1988). J. Mol. Struct. (Theochem) 169, 459.
[43] Bendazzoli, G. (1997, private communication).
[44] Sosa, C., Noga, J., Purvis, G. D. and Bartlett, R. J. (1988). Chem. Phys. Lett. 153, 139.
[45] Wu, W. and McWeeny, R. (1995). J. Mol. Struct. (Theochem) 341, 279.
[46] Cooper, D. L., Gerratt, J. and Raimondi, M. (1986). Nature 323, 699.
[47] Sironi, M., Cooper, D. L., Gerratt, J. and Raimondi, M. (1989). J. Chem. Soc., Chem. Commun., 1989, 675.
[48] McWeeny, R. (1996). Chemical Physics 204, 463.
Z.B. Maksić and W.J. Orville-Thomas (Editors)
Pauling's Legacy: Modern Modelling of the Chemical Bond
Theoretical and Computational Chemistry, Vol. 6
© 1999 Elsevier Science B.V. All rights reserved.
Advances in Many-Body Valence-Bond Theory

D. J. Klein
Texas A&M University-Galveston, Galveston, Texas 77553-1675
The area of valence-bond theory as advocated by Linus Pauling has come to be more widely accepted as encompassing interesting novel many-body characteristics, and several many-body techniques have been apparently successfully developed to greatly enhance the theory's computational amenability.

1. INTRODUCTORY SURVEY

The idea of valence structures goes back to classical chemistry. And indeed it was then at the heart of chemistry, though the theory was largely of a qualitative nature. There are both brief surveys [1,2] encompassing this history, as well as whole books on the subject [3]. With the advent of quantum mechanics Heitler and London [4] treated the valence bond in H2 in a highly suggestive manner, and shortly thereafter Rumer [5] and more especially Pauling [6,7] extended the theory so as to manifest the general chemical relevance. Ultimately this Valence-Bond (VB) theory, as refined to a qualitative resonance-theoretic form linking closely to classical chemical-bonding ideas, played a central role in Pauling's masterwork [8] The Nature of the Chemical Bond, which addressed essentially the full variety of chemical structures. But an alternative Molecular-Orbital (MO) approach eclipsed the VB approach, in so far as quantum-chemical research was concerned (say from the 1940s onward for a few decades). There were a variety of reasons for this, including perceived theoretical conceptual difficulties and supposed predictive failures - and perhaps most seriously the MO approach seemed much more computationally amenable. All this history is somewhat more fully discussed (from several points of view) in a couple of review articles [2] as well as the first two chapters of [9] Valence-Bond Theory and Chemical Structure.
These reviews describe, through the period of this eclipse, VB-theoretic work which was continued by a (prestigious or perhaps stubborn) minority of researchers (including Daudel, Hartmann, Simpson, Kotani, McWeeny, Löwdin, Matsen, McConnell, Coulson, Oosterhoff, Simonetta, Thorson, Goddard, Gallup, Messmer, & Gerratt, though most of these researchers had much diversified other interests too). Indeed these noted reviews in [2] and [9] indicated a sort of renaissance of VB-theoretic interest in the 1980s, and included several points:
* Significant progress was made on semiempirical theories especially as regards a
variety of applications [10,11].
* Incidentally many of the earlier criticisms of VB theory were themselves critiqued [12], and often found to be somewhat wanting - indeed occasionally even specious.
* In the ab initio realm a number of accurate perfect-pairing computations for novel circumstances were performed [13], relations to classical pictures often were explicated [14,15], and yet further ab initio computations including "resonance" were made feasible [16,17] and found to be highly accurate, perhaps most notably for benzene.
* And finally resonating VB theory was proposed [18] as crucial in understanding high-temperature superconductivity, whereafter the ideas and solution techniques were vastly developed, in a many-body context.

Here the focus is to be on many-body VB-theoretic techniques, which have advanced enormously since the mid-1980s, primarily because of work promoted in connection with high-temperature superconductivity. Of course there were earlier works in this general area, including some by Pauling [19], but such earlier works were fewer, were spread out over a longer period, and were pursued to less high accuracy. Such early work is here to be mentioned in context with each type of many-body technique described in later sections.

Section 2 sets out some nomenclature and clarifies some oft overlooked points of crucial interest for VB theory. Section 3 identifies crucial aspects of many-body theory, which is here viewed to be such as to allow treatment of systems of sizes including infinite - and some hopefully clarifying discussion is given. Section 4 surveys each of several many-body techniques in the context of VB theory, identifying recent work, while also indicating some of the earlier work on each technique.
To reasonably limit the focus here, the survey is primarily of many-body solution techniques as applied to a particular VB model, the covalent-space Pauling-Wheland VB model, represented by the Heisenberg spin Hamiltonian
$$\mathcal{H} = \sum_{i<j} J_{ij}\, 2\,\mathbf{s}_i \cdot \mathbf{s}_j$$
where $\mathbf{s}_k$ is the spin operator for electron (or site) $k$ and $J_{ij}$ is an exchange parameter of non-ferromagnetic sign (i.e., here $J_{ij} \geq 0$). Most simply the non-zero $J_{ij}$ are limited to nearest-neighbor pairs of sites $i$ and $j$, and these are all taken to be identical in value ($J_{ij} = J$ for neighboring $i$ and $j$). There are other representations of this Hamiltonian, e.g., in terms of permutations (on spin indices, or on spin-free electron indices, or perhaps most fundamentally on orbital-site indices). This effective Hamiltonian serves as a relatively "simple" non-trivial example of an explicitly correlated model, serving as a test-bed for different general many-body schemes. But too there are some techniques seemingly specifically designed for this model, and these then (e.g., "classical" molecular field theory) are to be less emphasized here. Many of the methods are believed (and indeed often partially so tested) to be applicable beyond the "simple" Heisenberg model, but even within this limited focus (of the Heisenberg model) the recent literature is immense, so that what is mentioned here is selective and brief.
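To make the model concrete, the following is a minimal sketch (not taken from any of the cited works) of the nearest-neighbor Heisenberg Hamiltonian with the $2\,\mathbf{s}_i\cdot\mathbf{s}_j$ convention used here, acting on the spin-product basis. It relies on the identity $2\,\mathbf{s}_i\cdot\mathbf{s}_j = P_{ij} - 1/2$, with $P_{ij}$ the transposition of the two spins. The function names, the bit-pattern encoding of basis states, and the use of simple power iteration are all illustrative assumptions suitable only for very small systems.

```python
from math import sqrt


def apply_hamiltonian(vec, bonds, J=1.0):
    """Return H|vec> for H = J * (sum over bonds) 2 s_i . s_j.

    vec maps an integer bit pattern (bit k set = spin up on site k) to an
    amplitude.  Uses 2 s_i . s_j = P_ij - 1/2, P_ij swapping spins i and j.
    """
    out = {}
    for state, amp in vec.items():
        for i, j in bonds:
            if ((state >> i) & 1) == ((state >> j) & 1):
                # parallel spins: P_ij acts as the identity, giving +J/2
                out[state] = out.get(state, 0.0) + 0.5 * J * amp
            else:
                # antiparallel spins: diagonal -J/2 plus the swapped state
                out[state] = out.get(state, 0.0) - 0.5 * J * amp
                swapped = state ^ ((1 << i) | (1 << j))
                out[swapped] = out.get(swapped, 0.0) + J * amp
    return out


def ground_state_energy(n_sites, bonds, J=1.0, n_iter=400):
    """Lowest eigenvalue of H (J >= 0) by power iteration on (shift - H).

    Assumes the alternating (Neel-like) start state overlaps the ground
    state, as it does for the even-membered rings used below.
    """
    shift = 0.5 * J * len(bonds)          # upper bound on the spectrum of H
    neel = sum(1 << k for k in range(0, n_sites, 2))   # up on even sites
    vec = {neel: 1.0}
    for _ in range(n_iter):
        hv = apply_hamiltonian(vec, bonds, J)
        new = {s: shift * vec.get(s, 0.0) - hv.get(s, 0.0)
               for s in set(vec) | set(hv)}
        norm = sqrt(sum(a * a for a in new.values()))
        vec = {s: a / norm for s, a in new.items()}
    hv = apply_hamiltonian(vec, bonds, J)
    return sum(a * hv.get(s, 0.0) for s, a in vec.items())


def ring(n):
    """Nearest-neighbor bonds of an n-membered ring (n > 2)."""
    return [(k, (k + 1) % n) for k in range(n)]


print(ground_state_energy(6, ring(6)))   # six-site ring, J = 1
```

For a single bond this gives the singlet energy $-3J/2$, and for the six-membered ring it should give $-(2+\sqrt{13})J$. The cost of such a direct treatment grows exponentially with the number of sites, which is the limitation taken up in section 3.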
2. VB THEORY: BASES, MODELS, & RESONANCE
As a first step some of the nomenclature and framework should be clarified. A VB basis consists of configurations of spin-paired localized orbitals, at least for the overall spin singlet case, while for higher overall spins there are also additional unpaired spins. Typical VB spin-pairing (or Rumer) diagrams in correspondence with the basis elements for the case of six electrons in six orbitals (as for the six singly-occupied atomic π-orbitals of benzene) appear as
[Diagrams: Rumer spin-pairing diagrams for six orbitals arranged around a ring.]
and display an evident close analogy to the primary classical valence structures for benzene, especially if augmented with the rest of the spin-pairing diagram for the σ-orbitals. Rumer [5] focused on the singlet case with orbitals singly occupied, it being recognized that the extension to doubly occupied orbitals is trivial. Pauling [6] too pointed out early the ready extension to the overall spin doublet case, and the approach taken hinted at further extension, e.g., as done in Simonetta's group [20]. But it is to be emphasized that there are other possibilities, anticipated to be most significant in cases when the VB structures as formalized by Rumer, Teller, & Weyl [21] do not correspond so closely to classical VB structures. That is, this Rumer basis is built in terms of orbitals around a cycle, and in many cases a chemical structure may have little to do with a single cycle. More important should be spin-pairing patterns involving pairing between nearer pairs of localized orbitals, independently of a relation to the formal basis of Rumer et al. [21].
Granted a VB basis, a semiempirical model represented on such a basis often is described as a VB model. Of course such models can be represented on any basis spanning the same space, but the semiempirical integral approximations are (usually) motivated from considerations in terms of the VB basis elements. One such case is the VB model of Pauling and Wheland expressed on the covalent space of configurations built from singly-occupied orbitals. As already noted the model so developed turns out to be essentially the Heisenberg spin Hamiltonian $\mathcal{H}$, which often is expressed on a basis of what are essentially Slater determinants of atomic orbitals, and novel approaches involving bases of products of "spin-waves" are not uncommon in physics. Working with representations on such alternative bases can lead to wave-functions and results seemingly rather far divorced from classical-chemical connection and interpretation.
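The Rumer diagrams for a small ring can be generated directly, which may help fix the idea of a VB basis. The sketch below enumerates the non-crossing perfect pairings of sites arranged around a cycle and flags those in which every pair joins nearest neighbors (the Kekulé-type diagrams). The function names and the representation of a diagram as a list of site pairs are illustrative assumptions.

```python
def rumer_diagrams(sites):
    """All non-crossing perfect pairings of `sites`, a tuple of an even
    number of site labels listed in order around a cycle."""
    if not sites:
        return [[]]
    first, rest = sites[0], sites[1:]
    diagrams = []
    # Pair the first site with a partner that leaves an even number of
    # sites on each side, so that no two pairing lines can cross.
    for k in range(0, len(rest), 2):
        partner = rest[k]
        inside, outside = rest[:k], rest[k + 1:]
        for d_in in rumer_diagrams(inside):
            for d_out in rumer_diagrams(outside):
                diagrams.append([(first, partner)] + d_in + d_out)
    return diagrams


def is_kekule(diagram, n):
    """True if every pair joins nearest neighbors on the n-membered ring."""
    return all((j - i) % n in (1, n - 1) for i, j in diagram)


six_ring = rumer_diagrams(tuple(range(6)))
for d in six_ring:
    print(d, "Kekule" if is_kekule(d, 6) else "long-bonded")
```

For six sites this yields five diagrams, two of them of Kekulé type and three with one long (Dewar-type) bond, in line with the classical valence structures of benzene. The construction is tied to a single cycle, which is exactly the limitation of the Rumer basis noted above for structures that are not well described by one ring.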
A variety of VB models arise, with possibly the most natural hierarchy [22] indicated in fig. 1. The hierarchy of models occurs in the column on the left and corresponding methods of solution are indicated on the right (though in some cases the methods have been little explored to date and are then identified with a "question mark"); the abbreviation CI refers to "configuration interaction".

Fig. 1 - Hierarchical scheme for VB models and their solutions. Models are listed from top to bottom, each followed by its methods of solution; the indented entries are the derivational steps leading to the next model.
  nonorthogonal-AO covalent + ionic VB model: complete CI, cluster expansions
    1st restriction
  primitive covalent VB model: complete CI, cluster expansions
    1st orthogonalization
  Pauling-Wheland VB model: complete CI, Ansätze, Néel state, Green's functions, cluster expansions, etc.
    2nd restriction
  Pauling-Wheland resonance model: complete CI, resonance-theory Ansatz
    2nd orthogonalization
  Herndon-Simpson model: complete CI, conjugated-circuits theory
    3rd restriction
  nonorthogonal Clar-structure model: complete CI, resonance Ansatz
    3rd orthogonalization
  Herndon-Hosoya model: complete CI, resonance Ansatz

The Pauling-Wheland model in this figure is the already noted Heisenberg spin Hamiltonian $\mathcal{H}$, the primary historical differences in physics and chemistry being the representation on different bases and application to different systems. The Pauling-Wheland resonance theory model entails a restriction to a subspace with a basis dominated by VB diagrams solely with neighbor spin-pairing (i.e., it is restricted to Kekulé structures, which in general cannot all be included in what is usually termed the Rumer basis). In the column on the left of the figure a systematic derivational approach [22] is indicated, with successive steps entailing either: restrictions to ever smaller spaces; or orthogonalizations of natural (initially non-orthogonal) bases for these spaces. It should be noted that the restrictions to a subspace should for greater accuracy fold in the effects of the complementary subspace through higher orders of degenerate perturbation-theoretic considerations. The orthogonalization of the Pauling-Wheland resonance model gives [23,24] the "Herndon-Simpson" model, which also has been termed [24] the "hard-dimer" model. When supplemented with a suitable wave-function Ansatz this yields [25] that which is known as the "conjugated circuits" model [26], and which has been much studied. The top and bottom models in this sequence really have been rather less studied though. Perhaps too it could be noted that older criticisms concerning nonorthogonality or size-consistency "catastrophes" are connected with certain intermediate derivational steps or models, but such criticisms are often largely irrelevant or misleading, as discussed elsewhere [12,22].
The nomenclature resonating VB theory best should be reserved for treatments in terms of VB states, especially the more chemical ones corresponding to more local spin-pairing patterns. And when such local spin-pairing is made manifest that should be of special note (at least in this review), particularly for the many-body case.
Just VB theory (without the adjective resonating) is presumably more general, but engenders an even greater degree of ambiguity in current usage: often this phrase is interpreted as simply indicating the use of a spin-adapted many-electron basis built from spin-projected products of localized orbitals; or sometimes it is interpreted simply to be a method which treats a VB model; or sometimes it merely indicates a suitably locally formulated method of solution.
3. MANY-BODY THEORY
It is appropriate too to clarify some nomenclature and aspects of many-body theory. Indeed here (and what seems often to be the usage elsewhere) "many-body" is applied to a theory if it is applicable in practice to infinite systems. That is, if a theory's computational "difficulty" scales with the number $N$ of electrons as $N^a$, it is to be a many-body theory only if $a = 0$, at least for some standard sorts of systems, say with translational symmetry. Since it is often said that the number of 2-electron integrals
$$\iint \chi_a^*(1)\,\chi_b^*(2)\,v(1,2)\,\chi_c(1)\,\chi_d(2)\,d\tau(1)\,d\tau(2)$$
in molecular SCF theory is $\sim N^4$ (the number of atomic orbitals $\chi_j$ being $\sim N$), it is perhaps worthwhile to consider how one can view SCF theory as a "many-body" theory. First, realizing that each $\chi_j$ has an amplitude that falls exponentially fast away from its atomic center, one sees that the integrals become rapidly very small (and presumably negligible) unless centers $a$ and $c$ are close (whence $\chi_a^*(1)\,\chi_c(1)$ need not be very small) while also $b$ and $d$ are close. Thence two factors of $N$ are eliminated from the $\sim N^4$ integrals. Next, rather than dealing with a bare Coulomb interaction, one realizes that the longer-range electron-electron interactions are going to be largely canceled by the corresponding electron-nucleus and nucleus-nucleus interactions. That is, with the usage of a consequent "dressed" interaction $v(1,2)$ (depending also on $a,b,c,d$), the quantum mechanical corrections are larger only when the pair of centers $a,c$ is not too far from the pair of centers $b,d$ (it being understood that this is not to preclude an ionic substance for which a classical Coulombic part persists and is to be separately treated in a classical Madelung type of calculation). Thus the number of remaining nonnegligible integrals is $\sim N$ (with a proportionality "constant" which is larger the greater the desired accuracy, and which best should be changed to assess the degree of accuracy). Next, upon invocation of translational symmetry, sets of $\sim N$ integrals become equal, and the last factor of $N$ is eliminated. Finally, beyond the integral evaluation there is a Fock-matrix diagonalization step, which for a general $M \times M$ matrix has a cost $\sim M^3$; but though there are $\sim N$ orbitals, translational symmetry again reduces it to $\sim N$ finite-size matrices, of which only a finite number at representative wave-vectors are diagonalized (the various quantities of interest being continuous functions of wave-vector). Thence the "many-body" character is manifest.
As a further clarifying point of discussion it is of value to describe why a common computational technique, namely that of (bare) "configuration interaction", does not qualify as a "many-body" method (within the current definition).
Basically the problem has to do with the number $\mathcal{N}$ of configurations needed for a size-extensive improvement (i.e., an improvement $\sim N$ in total energy or other size-extensive quantity). As is well-known the traditional approach to CI via matrix diagonalization has a computational cost $\sim \mathcal{N}^3$, though if instead of seeking all the eigensolutions only one (say the ground state) is sought, then the cost may be reduced [27] to perhaps almost as little as $\sim \mathcal{N}$. Thus the whole question reduces to how an appropriate number of configurations $\mathcal{N}$ depends on the number $N$ of electrons. Now wave-functions (like probabilities) are multiplicative in terms of independent circumstances such as excitations in distant parts of a large molecule, polymer, or solid. Thence allowing approximate independence of distant excitations to allow for physically plausible corrections, there results an exponential dependence of $\mathcal{N}$ on $N$, which then means CI is very far from being a many-body technique, though it may be useful on sufficiently small systems. Indeed a related point, namely that excited configurations come to dominate in complete CI, was early realized in connection with VB theory (where "excited" configurations might be identified as "ionic" configurations), and this was suggested as some sort of fundamental conceptual problem in VB theory, though really the "same" argument applies to SCF theory. That is, it was not realized that the quality of a local description restricted in a local subspace depends not on the probability that all local regions be so confined, but rather on the probability that the local part of a configuration lie in the local subspace associated to the restricted description, independently of the rest of the system. Rather surprisingly the resolution of such misunderstandings was made [28,29] in the mid 1950s in the context of MO theory, though the corresponding misunderstandings in the context of VB theory continued for some time. The well-known non-size-extensivity of CI limited to double-excitations above the SCF wavefunction turns out to have a VB analogue with VB diagrams limited to no more than one long bond beyond what is otherwise a Kekulé structure; such has been suggested a few times some decades back, apparently without understanding the defect, though this seems to have had little widespread effect on the acceptance or rejection of VB theory. Anyway global CI on very large systems is not a (viable) many-body technique, nor is finitely excitation-limited CI, while however SCF-based (Møller-Plesset) perturbation theory is well-known [29] to be a many-body technique.
It is to be emphasized that many-body theory is of interest for finite systems as well as infinite ones. First, such techniques enable a greater range of finite systems to be treated. Moreover, with many-body techniques well in hand, size-dependent errors in quantum-chemical computations should then too be well in hand.
Perhaps a main perceived problem with VB theory was that from early on it was framed in terms of a CI approach amongst the various VB basis states. And the number $\mathcal{N}$ of relevant VB structures (beyond a single one for certain cases) can increase exponentially with $N$. A focus on covalent structures only, or more restrictively on Kekulé structures only, just diminishes the exponent in the exponential dependence rather than changing the functional form of the dependence. Thus though the CI approach for finite (molecular) systems has been developed [30-32] explicitly for the VB approach, it is really something more that is needed to make VB theory more generally viable, i.e., what is (and always has been) needed is many-body VB theory.
4. MANY-BODY TECHNIQUES FOR VB MODELS
Now it is appropriate to discuss various possible many-body VB approaches, many of which have been actively developed in physics in the last decade.
Of especial interest are the approaches when the resonating VB structure of the sought-after state is evident in the computations. Primarily in application to the Pauling-Wheland VB model (or equivalently to the Heisenberg spin Hamiltonian, for the circumstance of antiferromagnetically signed exchange) the different approaches are considered separately in the following.
4.1. Configuration Interaction
Though it has been emphasized that complete CI (for direct application to the large system limit) is not a "many-body" technique, there is a simple way to view something much the same as a many-body scheme. One simply takes the results for VB-model computations for a suitable sequence of computable finite systems and makes some intelligent extrapolation to the desired infinite system. Evidently it would be better nomenclature to describe such a many-body approach as extrapolated CI, and it is an important approach, with the (seemingly exponentially) ever increasing computing power available to facilitate ever more nearly complete sequences from which to extrapolate. Indeed the extrapolation step can be so sophisticatedly developed that it becomes the point of emphasis, e.g., as with some "cluster and moment" schemes and the "renormalization-group" schemes, to be separately considered later in sections 4.3 and 4.7. Often cyclic boundary conditions are used to make convergence more rapid, a standard ploy since Bonner & Fisher's extrapolations [33] (including extrapolations for temperature-dependent thermodynamic expectations), and it is perhaps worthwhile to mention a simply implementable idea which enhances convergence even further: adjust [34] the boundary conditions for (apparent) maximum rate of convergence. Thus CI is of much value, it incidentally treating the interaction between the considered configurations perfectly correctly for any directly treatable (isolated) molecular systems, which can also be of much interest. As a consequence there are significant developments of various techniques for CI solutions to VB models, using either a Rumer basis [30] or a Young-Yamanouchi basis [31] or (what is in essence) a Slater-determinant basis [32]. Perhaps it should be noted that though the Rumer basis is non-orthogonal (and some researchers have thereby asserted this engenders great difficulty for extensive CI), Ramasesha & Soos [30] have demonstrated (in practice) notable amenability, when a complete Rumer basis is used. To enable treatment of larger systems another idea [35] selects through numerically determined "importance estimates" a subset of the full CI basis, perhaps extrapolating the results as a function of an importance parameter for the acceptance of configurations, or making some perturbative correction for the less significant configurations.
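The extrapolated-CI idea can be sketched for rings with cyclic boundary conditions. The example below computes the exact ground-state energy per site of small even-membered Heisenberg rings (with the $2\,\mathbf{s}_i\cdot\mathbf{s}_j$ convention) and then extrapolates to the infinite chain. The assumed fitting form $e(n) = e_\infty + a/n^2$, the two-point fit, and the function names are illustrative choices made here and are not taken from [33] or [34].

```python
from math import log, sqrt


def ring_energy_per_site(n, J=1.0, n_iter=300):
    """Ground-state energy per site of the n-site ring (n even) for
    H = J * sum_i 2 s_i . s_(i+1), by power iteration on (shift - H)."""
    bonds = [(k, (k + 1) % n) for k in range(n)]
    shift = 0.5 * J * n                      # upper bound on the spectrum
    vec = {sum(1 << k for k in range(0, n, 2)): 1.0}   # Neel-like start
    energy = 0.0
    for _ in range(n_iter):
        hv = {}
        for s, a in vec.items():
            for i, j in bonds:
                if ((s >> i) & 1) == ((s >> j) & 1):
                    hv[s] = hv.get(s, 0.0) + 0.5 * J * a
                else:
                    hv[s] = hv.get(s, 0.0) - 0.5 * J * a
                    t = s ^ ((1 << i) | (1 << j))
                    hv[t] = hv.get(t, 0.0) + J * a
        # Rayleigh quotient <v|H|v> for the current normalized vector
        energy = sum(a * hv.get(s, 0.0) for s, a in vec.items())
        new = {s: shift * vec.get(s, 0.0) - hv[s] for s in hv}
        norm = sqrt(sum(a * a for a in new.values()))
        vec = {s: a / norm for s, a in new.items()}
    return energy / n


def extrapolate(n1, e1, n2, e2):
    """Two-point extrapolation assuming e(n) = e_inf + a / n**2."""
    x1, x2 = 1.0 / n1 ** 2, 1.0 / n2 ** 2
    return (e1 * x2 - e2 * x1) / (x2 - x1)


e8, e10 = ring_energy_per_site(8), ring_energy_per_site(10)
estimate = extrapolate(8, e8, 10, e10)
exact = 0.5 - 2.0 * log(2.0)     # Bethe-Ansatz value per site, J = 1
print(e8, e10, estimate, exact)
```

Even this crude two-point fit moves the ten-site result substantially closer to the exactly known infinite-chain value, which is the sense in which extrapolation turns finite-system CI into a usable many-body scheme. A more careful treatment would use longer sequences and test the assumed form of the size dependence.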
4.2. Many-Body Perturbation Theory
A simple approach for the Heisenberg (or equivalently the Pauling-Wheland VB) model takes the corresponding Ising model as a zero-order description, with Néel-like spins up or down located in so far as possible on neighboring sites. Indeed Hartmann [36] early considered this approach, exhibiting the chemically nicely interpretable analytic results arising in second order, and Malrieu's group [37] has extended and similarly much used this method. Moreover, in the context of interest in high-temperature superconductivity this approach has further been pushed to higher order [38] to give highly accurate results for extended systems of interest. But [39] there are reservations characterizable in terms of the structural circumstances for which Néel-state or resonating-VB descriptions are more suitable, it being that such "zero-order" descriptions are qualitatively different:
* the Néel-like description is best for high-coordination-number systems with little "frustration" (such circumstance being common for many of the 3-dimensional structures of more conventional solid-state physics); whereas
* the resonating-VB description is best for low-coordination-number systems which admit many Kekulé structures (this case being that of traditional benzenoid chemistry considered by Pauling and Wheland [7,40]).
Indeed for the Néel-based theory to work best it is better to have a bipartite system (i.e., a system with two sets of sites, all sites of either set having solely members of the other set as neighbors). Of course, when there is a question about the adequacy of the zero-order description, questions about the (practical) convergence of the perturbation series arise. But for favorable systems these [38] or closely related [41] expansions can now be made through high orders to obtain very accurate results.
One can also imagine a perturbation expansion based on the resonating VB limit as zero-order. This has been considered [23,24] for some general circumstance.
But for the covalent space for the π-networks of benzenoids the different Kekulé structures would all be degenerate, and the degenerate perturbation theory can be neatly described to give rise to a new effective Hamiltonian on this subspace, thereby giving rise to the Pauling-Wheland and Herndon-Simpson models of figure 1.
4.3. Cluster and Moment Methods
In these schemes complete computations are made for different fragments of the network and the results for these fragments are incorporated into a systematic extrapolation method. One energy cluster-expansion scheme [42] expresses a system energy for any fragment as a sum over contributions from all subfragments, with these different subfragment contributions obtained by ("Möbius") inversion from all fragments up to a certain size. The energy for a large fragment then is approximated as a sum over subfragments truncated at a given size, and it may be shown that the result is accurate up to a corresponding order of a suitable many-body perturbation theory, and moreover quite accurate results are obtainable [43], if sufficiently large subfragments (say of up to 10 sites) are utilized. The idea extended to a degenerate perturbation-theoretic version of this provides a means for deriving models, such as the VB models [37,44] of figure 1. A scheme termed the "connected moments" method [45] makes use of what also may be identified as CI matrix elements for such different subfragments, and likewise yields quite accurate results [43,46]. In using these various schemes (as well as many-body perturbation theory) it is appropriate to follow the total system energy estimates as a function of order of approximation, so as to obtain an estimate of the remnant error. Indeed these methods have somewhat the flavor of "local" perturbation-theoretic methods (though with partial contributions from higher orders). Finally such methods apply for other expectations, and related techniques are [47] standard for the computation of statistical thermodynamic quantities.
4.4. Spin-Waves and Green's Functions
Early on this type of approach was developed, and for a few decades appears to have been viewed as the approach of choice, though the preference has become very much less clear, especially with the low-coordination-number (lower-dimensional, sometimes frustrated) systems of interest in the solid-state community during the last decade. Often there is some sort of transformation from the spin-operator representation to a second-quantized (usually bosonic) creation/annihilation-operator formalism, whence expression in wave-vector space and solution via a conventional looking SCF decoupling follows. Rather frequently however such transformations [48] append a non-physical space to achieve the ordinary creation/annihilation-operator formalism, with which there is nothing formally wrong so far as the representation is concerned. But typically in the (SCF-like) solution phase components of the adjunct non-physical space are mixed into the considered approximate "solutions". A recent type of such treatment [49] is via so-called "slave bosons", and apparently the approach can be carried through to yield presumably rather accurate results. But often it seems [39] that such methods are Néel-state based and are best applicable to systems with higher coordination number with little "frustration". One rigorous transformation [50] of the spin-Hamiltonian representation to a fermionic creation/annihilation-operator formalism avoids such an auxiliary non-physical space, but is usually restricted to the particular case of a linear chain (or something fairly close to this) if a reasonable looking Hamiltonian is to be obtained (i.e., with limitation to 2- or 3- or 4-particle interactions).
Also some other recent treatments [51] rather broadly seem to fall into the present general category.
4.5. Wave-Function Cluster Expansions
There are at least three types of cluster expansions, perhaps the most conventional simply being based on an ordinary MO-based SCF solution, on a full space entailing both covalent and ionic structures. Though the wave-function has delocalized orbitals, the expansion is profitably made in a localized framework, at least if treating one of the VB models or one of the Hubbard/PPP models near the VB limit, and really such is the point of the so-called Gutzwiller Ansatz [52]. The problem of matrix element evaluation for extended systems turns out to be somewhat challenging, with many different ideas for its treatment [53], and a neat systematic approach is via Čížek's [54] coupled-cluster technique, which now has been quite successfully used, making use [55] of the localized representation for the excitations.
Another type of scheme for the purely covalent models is based on a zero-order Néel-state description. Indeed two different such expansions were proposed a few decades back. The first by Hulthén [56] & Kasteleyn [57] (later rediscovered [58]) involves different multiplicatively cluster-expanded weightings for each determinant of atomic orbitals. The second by Vroelant & Daudel [59] (also later rediscovered several times [60]) is essentially a standard cluster expansion based on a zero-order determinant of atomic orbitals with different spins assigned to the alternant subsets of sites of a bipartite network. Each scheme is recognized to give rise to a type of statistical mechanical problem, with the wave-function overlap corresponding to a stat-mechanical partition function and the wave-function energy expectation corresponding to a stat-mechanical local correlation-functional derivative of the partition function. Indeed the Hulthén-Kasteleyn Ansatz gives rise [58,61] in the lowest non-trivial order to the ordinary Ising model, so that the problem is susceptible to exact (variational) treatment in 1 and 2 dimensions.
For higher orders for the Ansatz of Vroelant & Daudel the coupled-cluster technique again seems quite powerful [62]. An as yet incompletely explored point is [61] that the lowest non-trivial order of the Hulthén-Kasteleyn Ansatz, as applied to the linear chain in a variational format, rather directly gives rise to a whole set of orthonormal "excited" cluster-expanded states which may in turn be used as a new (explicitly correlated) basis for the problem.
Also a cluster expansion based on the set of Kekulé structures is possible [63], and indeed (in the nomenclature used here) evidently yields the first suggested many-body resonating VB solution scheme (earlier many-body approaches seeking solutions to VB models without much attention to the chemically appealing local spin pairing). This scheme in its lowest order with spin pairing constrained to nearest neighbors (i.e., Kekulé structures) has now been rather widely studied [64], and extensions beyond nearest-neighbor spin pairing evidently can be made to a limited extent for modest improvement [65] or substantially further for quite high accuracy [66].
Further there are yet some other seemingly exotic wavefunction Ansätze which might be classified as cluster expansions [67]. These involve phrases such as "flux phase", "spiral phase", and "commensurate flux phase".
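The correspondence between wave-function overlaps and partition functions noted in this section can be illustrated for the Kekulé-structure Ansatz using Pauling's island-counting rule: for two normalized singlet-paired structures with $n$ pairs each, the overlap has magnitude $2^{i-n}$, where $i$ is the number of closed loops ("islands") in their superposition. The sketch below assumes a phase convention in which all such overlaps are positive (as can be arranged on a bipartite network), and the function names are illustrative.

```python
def count_islands(m1, m2):
    """Number of closed loops ("islands") in the superposition of two
    perfect pairings m1 and m2 (lists of site pairs) on the same sites."""
    p1, p2 = {}, {}
    for a, b in m1:
        p1[a], p1[b] = b, a
    for a, b in m2:
        p2[a], p2[b] = b, a
    seen, islands = set(), 0
    for start in p1:
        if start in seen:
            continue
        islands += 1
        site = start
        while site not in seen:          # walk alternately along m1 and m2
            seen.add(site)
            partner = p1[site]
            seen.add(partner)
            site = p2[partner]
    return islands


def overlap(m1, m2):
    """Island-rule overlap 2**(islands - pairs); positive phases assumed."""
    return 2.0 ** (count_islands(m1, m2) - len(m1))


def norm_squared(structures, weights=None):
    """<Psi|Psi> for Psi = sum_K w_K |K>, a double sum over structures."""
    if weights is None:
        weights = [1.0] * len(structures)
    return sum(wa * wb * overlap(a, b)
               for a, wa in zip(structures, weights)
               for b, wb in zip(structures, weights))


kekule_1 = [(0, 1), (2, 3), (4, 5)]      # the two Kekule structures
kekule_2 = [(1, 2), (3, 4), (5, 0)]      # of a six-membered ring
print(overlap(kekule_1, kekule_2), norm_squared([kekule_1, kekule_2]))
```

For the six-membered ring the two Kekulé structures superpose into a single island, so their overlap is 1/4 and the equally weighted superposition has squared norm 5/2. For an extended network the same double sum runs over exponentially many pairs of structures weighted by loop counts, which is why it takes the form of a statistical mechanical partition function rather than something to be evaluated term by term.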
4.6. Monte-Carlo Computations
This general type of approach in a few different versions has over the last decade become especially powerful, though in most versions it really is (like CI) for finite systems; it is just that significantly larger systems are treatable than via CI, so that there is less ambiguity and somewhat higher accuracy in extrapolation to infinite systems is possible. For the case of a nearest-neighbor model $\mathcal{H}$ on a bipartite network there is a way [68] of assigning a phase to the spin-product basis so that the ground state is "nodeless" (on this signed basis), and the Monte Carlo approach has a special efficacious possibility for application, which now has been [68] much developed and successfully used. Another approach [58,66] avoids these (nearest-neighbor and bipartiteness) assumptions and replaces them by a high-order wave-function cluster expansion Ansatz whose relevant matrix elements are treated via a modification of the Metropolis et al. algorithm [70], the results achieved being capable of high accuracy. Finally also there are other working Monte Carlo schemes which are free of all the previous assumptions (of nearest-neighborness and bipartiteness and of wave-function Ansatz), such also achieving high accuracy [71].
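The phase assignment that makes the ground state nodeless on a bipartite network can be checked explicitly for small systems. The sketch below assumes the standard Marshall-type choice, multiplying each spin-product state by $(-1)^{n_A}$ with $n_A$ the number of up spins on one sublattice A; whether this is precisely the convention of [68] is an assumption made here. With this choice every off-diagonal element of the nearest-neighbor Hamiltonian becomes non-positive, which is the property a sign-problem-free Monte Carlo sampling relies on. The function names are illustrative.

```python
def signed_offdiagonals(n_sites, bonds, sublattice_a, J=1.0):
    """Off-diagonal elements of H = J * sum 2 s_i . s_j on the
    spin-product basis after each basis state is multiplied by
    (-1)**(number of up spins on sublattice A)."""
    def sign(state):
        ups_on_a = sum((state >> k) & 1 for k in sublattice_a)
        return -1.0 if ups_on_a % 2 else 1.0

    elements = []
    for state in range(1 << n_sites):
        for i, j in bonds:
            if ((state >> i) & 1) != ((state >> j) & 1):
                # the bare element linking state and swapped is +J
                swapped = state ^ ((1 << i) | (1 << j))
                elements.append(sign(state) * sign(swapped) * J)
    return elements


def is_nodeless_basis(n_sites, bonds, sublattice_a, J=1.0):
    """True if all signed off-diagonal elements are non-positive."""
    return all(x <= 0.0
               for x in signed_offdiagonals(n_sites, bonds, sublattice_a, J))


hexagon = [(k, (k + 1) % 6) for k in range(6)]
triangle = [(0, 1), (1, 2), (2, 0)]
print(is_nodeless_basis(6, hexagon, [0, 2, 4]))   # bipartite ring
print(is_nodeless_basis(3, triangle, [0]))        # frustrated, not bipartite
```

The check succeeds for the bipartite six-membered ring and fails for the triangle, where no division into two sublattices places every bond between the sublattices. This is the sense in which the nearest-neighbor and bipartiteness assumptions matter for this version of the Monte Carlo approach.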
4.7. Renormalization-Group Techniques
Here too as formulated for the ground state there are different versions of this powerful technique, which for the present type of models seems best formulated in a real-space framework. The earliest version, nicely reviewed [72] shortly after its initial introduction [73], renormalizes a local block (or group) of spins which mutually interact with one another so as to form a spin doublet for a block, which in turn is coupled via, say, degenerate perturbation-theoretically calculated interactions to its neighboring blocks; the blocks are identified as "new sites" for a renormalized model and the whole process is iterated. This renormalizes a nearest-neighbor Heisenberg model to a new nearest-neighbor Heisenberg model, and the iteration process can be analytically continued (toward infinity, where the renormalized interblock coupling diminishes toward 0). The whole process now [74,75] has been much applied to 2-dimensional networks. The results for the simplest (first-order degenerate perturbation-theoretic) form [72,73,74] give variational upper bounds to the ground-state energy, but can be made somewhat more accurate [75] with the sacrifice of this bound.
But for what seem typically to be even higher accuracy results there is the numerical "density-matrix" renormalization-group scheme [76]. This scheme keeps a notably more complex structure for each of the new sites obtained from a block, so that at each step the renormalized model really needs to be treated anew. Thus one does not rigorously arrive at the infinite limit, but the number of original sites encompassed in the computation goes up exponentially with the number of renormalization steps, so that the technique is simply repeated till the per-site changes at subsequent steps are driven down to a level comparable to the numerical "noise" in the various computer algorithms utilized. This scheme has been found [76,77] to be highly accurate for a variety of correlated 1-dimensional models.
4.8. Miscellany
There are a few other solution schemes.
There is [78] novel work paying close attention to sets of states incorporating explicit "resonating VB" character. Another simple approach (which many do not acknowledge as a many-body scheme) entails Pauling's general idea [8] that just counting Kekulé structures is relevant, perhaps even for semiquantitative work, so that there is a large body of chemical work making such enumerations, e.g., reviewed in [79]. Indeed with focus on the many-body case of extended metallic systems Pauling [80] has in recent years continued with this enumeratively based work. Finally mention should be made of the Bethe Ansatz wavefunctions [56,81], which solve $\mathcal{H}$ (with nearest-neighbor interactions and all such $J_{ij} = J$) exactly for the linear chain, but this method seems very difficult to extend to higher dimensions or even to general linear chain systems (though investigations continue making more limited extensions; see, e.g., ref. [82]).
5. OVERVIEW AND PROSPECTS
Evidently there are now a number of different types of working many-body techniques for the solution of VB models, and amongst these a couple are categorizable as resonating VB techniques. Rather interestingly for many-body computations (as developed so far for models) there appears little to differentiate in terms of computational difficulty between the use of the VB-based or (explicitly correlated) MO-based pictures. This apparent conclusion contrasts with statements often made even during the last decade, though usually these are made in the context of ab initio work where the requisite computations are more challenging (most especially) in terms of the evaluation and manipulation of integrals. Such an oft proclaimed statement of "conventional wisdom" might say: "Though there may be conceptual advantages to VB theory its computational implementation is much more difficult."
Here it is contrarily contended that once comparisons to simple SCF computations are foregone and instead comparisons are made in terms of an explicitly correlated sizeconsistent many-body MO-based theory, computational difficulties in the MO- and VBbased approaches seem more comparable (and perhaps even sometimes the "same" in some fundamental computational sense). Notably the same techniques such as CI, cluster expansion, Monte Carlo, and renormalization-group turn out often to be the methods of choice for either VB or MO pictures. Also it is interesting how Cl-like manipulations are for high accuracy so intimately involved in so many of the techniques: CI extrapolation, clusters and moments methods, wavefunction cluster expansion, and renormalization-group techniques. Interestingly yet further is how many of the techniques are so intimately related to stat-mechanical techniques for the evaluation of partition functions and consequent expectations: many-body perturbation theory, Green's function techniques, cluster and moment methods, wavefunction clusterexpansions, Monte-Carlo approaches, and renormalization-group methods. But with some further elaboration the present view contrasts with yet another occasionally proclaimed type of statement of "conventional wisdom": "In proceeding to ever higher-order terms in the MO- and VB-based theories convergence of the final results should occur (so that general MO-based computational effaciousness occurs)." Here again we take a contrary view. First the parenthetic statement is no logical direct consequence even if the non-parenthetic assertion was to be granted - whether or not
the computational efficaciousness turns out to be comparable likely depends on a detailed comparison of methodologies, and likely also depends on the particular systems treated. Particularly if the many-body limit is taken first, computations via the two approaches need not converge toward the same limit. E.g., the two zero-order wave-functions may fall into two different "universality" classes, say with different exclusionary types of long-range order which are not quenched through extension to higher orders. Such ideas are implicit in a number of studies:
* in Lieb, Schultz, & Mattis' [83] emphasis of the occurrence of (non-physical) long-range interactions in typical 0-order SCF (model) Hamiltonians;
* in Thorson & coworkers' [84] findings of qualitative distinctions in the second-order reduced density matrices for resonance-theoretic and SCF-type wave-functions;
* in Anderson's proposal [85] of the possibility of a novel type of long-range order in a resonating VB description (of the Heisenberg model ground state on the triangular lattice); and
* in the occurrence of a long-range spin-pairing order [86] in resonance-theoretic (cluster-expanded) wave-functions, this ordering being seemingly complementary to Néel ordering.
That is, building up corrections (via the usual many-body techniques) often seems unable to shift a wave-function between universality classes associated with different long-range orders. But of course even for finite systems there are also questions of rates of convergence and ease of interpretation of results, and in this case it would seem that whatever preferences there might be would depend on the particular system under consideration. It seems that there is much promise in and for VB theory, even for many-body VB theory. There are very many techniques, results, and ideas now found more in the physics literature, and much of this should be applicable in a more chemical context.
Only an outline of the extensive literature on techniques has been attempted here, and it seems likely that some of these techniques are susceptible to much further improvement. But also it should be mentioned that there are extensive recent theorematic results, e.g., as a sample of more than a hundred pages worth may be noted in [87], some of which may ultimately prove of chemical or materials-science interest. There are [88] other (somewhat differently flavored and more mathematically detailed) reviews of the area from the point of view of physics. Surely VB theory is an important area pioneered by Linus Pauling. Evidently this legacy is becoming more widely appreciated - and a great number of fundamental results and powerful computational methods have been developed within the last decade or so. Bringing chemical relevance out of this flood of work seems an exciting prospect. Presumably there soon will be many chemical and physical applications, such as I think would especially please Pauling. Acknowledgement is made to the Welch Foundation of Houston, Texas.
REFERENCES
[1] H. M. Leicester, J. Chem. Ed. 36 (1959) 328. D. F. Larder, J. Chem. Ed. 44 (1967) 661. L. Pauling, J. Chem. Ed. 61 (1984) 201. [2] S. R. La Paglia, in chap. 1 of Introductory Quantum Chemistry (Harper & Row, New York, 1971). D. J. Klein & N. Trinajstić, J. Chem. Ed. 67 (1990) 633. [3] C. A. Russell, The History of Valency (University Press, Leicester, 1971). A. N. Stranges, Electrons and Valence (Texas A&M University Press, College Station, 1982). [4] W. Heitler & F. London, Zeit. Phys. 44 (1927) 455. [5] G. Rumer, Göttinger Nach. Ges. Wiss. (1932) 337. [6] L. Pauling, J. Chem. Phys. 1 (1933) 280. [7] L. Pauling & G. W. Wheland, J. Chem. Phys. 1 (1933) 362. [8] L. Pauling, The Nature of the Chemical Bond (Cornell University Press, Ithaca, 3rd Edn., 1960). [9] Valence-Bond Theory & Chemical Structure, ed. D. J. Klein & N. Trinajstić (Elsevier, Amsterdam, 1990). [10] R. D. Harcourt, Qualitative Valence Bond Descriptions of Electron-Rich Molecules (Springer-Verlag, Berlin, 1982). N. D. Epiotis, Unified Valence Bond Theory of Electronic Structure - Applications (Springer-Verlag, Berlin, 1983). S. S. Shaik, Progr. Phys. Org. Chem. 15 (1985) 197. A. Pross, Acc. Chem. Res. 18 (1985) 212. [11] A. A. Ovchinnikov & V. O. Cheranovskii, Teor. & Eksp. Khim. 16 (1979) 147. D. J. Klein, C. J. Nelin, S. A. Alexander, & F. A. Matsen, J. Chem. Phys. 77 (1982) 3101. A. S. Shawali, C. Parkanyi, & W. C. Herndon, J. Org. Chem. 47 (1982) 734. J. P. Malrieu & D. Maynau, J. Am. Chem. Soc. 104 (1982) 3021. Y. Pipeng, Kexue Tongbao 27 (1982) 961. D. Maynau, M. Said, & J. P. Malrieu, J. Am. Chem. Soc. 105 (1983) 75. G. E. Hite, A. Metropoulos, D. J. Klein, T. G. Schmalz, & W. A. Seitz, Theor. Chim. Acta 69 (1986) 369. Z. G. Soos & G. W. Hayden, Mol. Cryst. & Liq. Cryst. 160 (1988) 421. S. Lee, J. Am. Chem. Soc. 90 (1989) 2732. Y. Pipeng, Theor. Chim. Acta 77 (1990) 213. H. Zhu & Y. Jiang, Chem. Phys. Lett. 193 (1992) 446. S. Li & Y. Jiang, J. Am. Chem. Soc. 117 (1995) 8401. [12] D. J.
Klein, Pure & Appl. Chem. 55 (1982) 299. [13] R. P. Messmer, P. A. Schultz, R. C. Tatar, & H. J. Freund, Chem. Phys. Lett. 126 (1986) 176. M. H. McAdon & W. A. Goddard III, Phys. Rev. J. Phys. Chem. 91 (1987) 2607. P. C. Hiberty & D. L. Cooper, J. Mol. Str. 169 (1988) 437. C. H. Patterson & R. P. Messmer, J. Am. Chem. Soc. 111 (1989) 8059.
[14] R. D. Harcourt, J. Am. Chem. Soc. 100 (1978) 8060. S. Kuwajima, J. Chem. Phys. 74 (1981) 6342. S. S. Shaik, Nouv. J. Chim. 6 (1982) 159. B. Kirtman & W. E. Palke, Croat. Chim. Acta 57 (1984) 1247. P. C. Hiberty & G. Ohanessian, Intl. J. Quantum Chem. 27 (1985) 259. P. Karafiloglou & J. P. Malrieu, Chem. Phys. 104 (1986) 383. R. D. Harcourt, F. L. Skrezenek, R. M. Wilson, & R. H. Flegg, J. Chem. Soc., Faraday Trans. 2 (1986) 495. G. Sini, G. Ohanessian, P. C. Hiberty, & S. S. Shaik, J. Am. Chem. Soc. 112 (1990) 1407. [15] M. Said, D. Maynau, J. P. Malrieu, & M. A. Garcia-Bach, J. Am. Chem. Soc. 106 (1984) 571. [16] D. L. Cooper, J. Gerratt, & M. Raimondi, Nature 323 (1986) 699. [17] J. Gerratt, D. L. Cooper, & M. Raimondi, pages 287-350 in ref. [9]. [18] P. W. Anderson, Science 235 (1987) 1196. [19] L. Pauling, Nature 161 (1948) 1019. L. Pauling, Proc. Roy. Soc. (London) A 196 (1949) 343. L. Pauling, chap. 11 of ref. [8]. [20] M. Simonetta, E. Gianinetti, & I. Vandoni, J. Chem. Phys. 48 (1968) 1579. [21] G. Rumer, E. Teller, & H. Weyl, Gött. Nach. Ges. Wiss. (1932) 499. [22] D. J. Klein, Topics Curr. Chem. 153 (1990) 59. [23] D. J. Klein & N. Trinajstić, Pure & Appl. Chem. 61 (1989) 2107. [24] D. S. Rokhsar & S. A. Kivelson, Phys. Rev. Lett. 61 (1988) 2376. S. A. Kivelson, Phys. Rev. B 39 (1989) 259. [25] L. J. Schaad & B. A. Hess, Jr., Pure & Appl. Chem. 54 (1982) 1097. [26] W. C. Herndon, J. Am. Chem. Soc. 95 (1973) 2404. W. C. Herndon, Thermochimica Acta 8 (1974) 225. M. Randić, Tetrahedron 33 (1977) 1905. M. Randić, J. Am. Chem. Soc. 99 (1977) 444. [27] E. R. Davidson, J. Comp. Phys. 17 (1975) 87. [28] H. Bethe, Phys. 103 (1955) 1353. R. Brout, Phys. Rev. 111 (1958) 1324. H. Primas, pages 45-74 in Modern Quantum Chemistry I, ed. O. Sinanoglu (Academic Press, New York, 1965). [29] J. Goldstone, Proc. Roy. Soc. (London) A 239 (1957) 267. J. Hubbard, Proc. Roy. Soc. (London) A 240 (1957) 539. [30] S. Ramasesha & Z. G. Soos, Intl. J. Quantum Chem.
25 (1984) 1003. [31] S. A. Alexander & T. G. Schmalz, J. Am. Chem. Soc. 109 (1987) 6933. [32] C. E. Dagotto & A. Moreo, Phys. Rev. B 38 (1988) 5087. J. E. Hirsch, S. Tang, E. Loh, Jr., & D. J. Scalapino, Phys. Rev. Lett. 60 (1988) 1688. D. Poilblanc, H. J. Schulz, & T. Ziman, Phys. Rev. B 46 (1992) 6435. [33] J. C. Bonner & M. E. Fisher, Phys. Rev. A 135 (1964) 640. [34] C. Gros, Zeit. Phys. B 86 (1992) 359. M. Vekic & S. R. White, Phys. Rev. Lett. 71 (1993) 4283.
[35] H. De Raedt & W. von der Linden, Phys. Rev. B 45 (1992) 8787. J. Riera & N. Laouini, Phys. Rev. B 48 (1993) 15346. N. Guihery, N. B. Amor, D. Maynau, & J.-P. Malrieu, J. Chem. Phys. 104 (1996) 3701. N. A. Modine & E. Kaxiras, Phys. Rev. B 53 (1996) 2546. V. A. Kashurnikov, Phys. Rev. B 53 (1996) 5932. [36] H. Hartmann, Zeit. Naturforschung A 2 (1947) 259. [37] D. Maynau, Ph. Durand, J. P. Daudey, & J.-P. Malrieu, Phys. Rev. A 28 (1983) 3193. [38] D. Huse, Phys. Rev. B 37 (1988) 2380. R. R. P. Singh, Phys. Rev. B 39 (1989) 9760. M. P. Gelfand, R. R. P. Singh, & P. A. Huse, J. Stat. Phys. 59 (1990) 1093. W. H. Zheng, J. Oitmaa, & C. J. Hamer, Phys. Rev. B 43 (1991) 8321. M. Kim & J. Hong, Phys. Rev. B 44 (1991) 6803. [39] D. J. Klein, S. A. Alexander, W. A. Seitz, T. G. Schmalz, & G. E. Hite, Theor. Chim. Acta 69 (1986) 393. [40] G. W. Wheland, Resonance in Organic Chemistry (John Wiley & Sons, New York, 1955). [41] K. W. Becker, H. Won, & P. Fulde, Zeit. Phys. B 75 (1989) 335. M. Kim & J. Hong, Phys. Rev. B 44 (1991) 6803. [42] D. J. Klein, Intl. J. Quantum Chem. S20 (1986) 153. [43] J. Wang, Phys. Rev. B 45 (1992) 2282. Z. Weihong, J. Oitmaa, & C. J. Hamer, Phys. Rev. B 52 (1995) 10278. [44] R. D. Poshusta & D. J. Klein, Phys. Rev. Lett. 48 (1982) 1555. R. D. Poshusta, T. G. Schmalz, & D. J. Klein, Mol. Phys. 66 (1989) 317. [45] J. Cioslowski, Phys. Rev. Lett. 58 (1987) 83. J. Cioslowski, Phys. Rev. 36 (1987) 374. C. J. Morningstar, Phys. Rev. D 46 (1992) 824. [46] J. Cioslowski, Chem. Phys. Lett. 134 (1987) 507. J. Cioslowski, Commun. Math. Chem. (MATCH) 22 (1987) 245. K. C. Lee & C. R. Lo, J. Phys. C 6 (1994) 7075. M. J. Tomlinson & L. C. L. Hollenberg, Phys. Rev. B 50 (1994) 1275. [47] J. W. Essam & M. E. Fisher, Rev. Mod. Phys. 42 (1970) 271. C. Domb, pages 1-94 in Phase Transitions & Critical Phenomena III, ed. C. Domb & M. S. Green (Academic Press, New York, 1974). [48] P. W. Anderson, Phys. Rev. 86 (1952) 694. R. Kubo, Phys. Rev. 87 (1952) 568. [49] G.
Kotliar & A. E. Ruckenstein, Phys. Rev. Lett. 57 (1986) 1362. T. Li, P. Wölfle, & P. Hirschfeld, Phys. Rev. B 40 (1989) 6817. M. Lavagna, Intl. J. Mod. Phys. B 6 (1991) 885. Several articles in Physica 199-200 (1994) by M. F. Hundley et al., S. Doniach et al., T. Takabatake et al., & P. S. Riseborough on pages 443, 450, 457, & 466. [50] S. Rodriguez, Phys. Rev. 116 (1959) 1474. Z. G. Soos, J. Chem. Phys. 43 (1965) 1121. [51] A. P. Arovas & A. Auerbach, Phys. Rev. B 38 (1988) 316. G. G. Batrouni & R. T. Scalettar, Phys. Rev. B 42 (1990) 2282. E. Y. Loh, J. E. Gubernatis, R. T. Scalettar, S. R. White, D. J. Scalapino, & R. L. Sugar, Phys. Rev. B 41 (1990) 9301.
[52] M. C. Gutzwiller, Phys. Rev. 134 A (1964) 993. M. C. Gutzwiller, Phys. Rev. 137 A (1965) 1726. [53] G. Stollhoff & P. Fulde, Zeit. Phys. 26 (1977) 257. P. Joyes, Phys. Rev. B 26 (1982) 6307. F. Gebhard & D. Vollhardt, Phys. Rev. B 38 (1988) 6911. M.-B. Lepetit, B. Oujia, J.-P. Malrieu, & D. Maynau, Phys. Rev. A 39 (1989) 3274. J. Q. G. Wang, S. Fantoni, E. Tosatti, & L. Yu, Phys. Rev. B 46 (1992) 8894. [54] J. Cizek, J. Chem. Phys. 45 (1966) 4256. [55] M. Roger & J. H. Hetherington, Europhys. Lett. 11 (1990) 255. R. F. Bishop, Theor. Chim. Acta 80 (1991) 95. C. F. Lo, E. Manousakis, & Y. L. Wang, Phys. Lett. A 156 (1991) 42. L. Petit & M. Roger, Phys. Rev. B 49 (1994) 3453. [56] L. Hulthen, Arkiv Mat. Astron. Fys. A 26, #11 (1938) 1. [57] P. W. Kasteleyn, Physica 28 (1952) 104. [58] D. A. Huse & V. Elser, Phys. Rev. Lett. 60 (1988) 2531. [59] C. Vroelant & R. Daudel, Bull. Soc. Chim. 16 (1949) 36. [60] I. Nebanzahl, Phys. Rev. 177 (1969) 1001. R. R. Bartowski, Phys. Rev. B 55 (1972) 4536. D. J. Klein, J. Chem. Phys. 64 (1976) 4868. M. A. Suzuki, J. Stat. Phys. 43 (1986) 883. [61] M. A. Garcia-Bach & D. J. Klein, J. Phys. A 29 (1996) 103. [62] R. F. Bishop, J. B. Parkinson, & Y. Xian, Phys. Rev. B 44 (1991) 9425. F. E. Harris, Phys. Rev. B 47 (1993) 7903. R. F. Bishop, R. G. Hale, & Y. Xian, Phys. Rev. Lett. 73 (1994) 3157. [63] D. J. Klein, Phys. Rev. B 19 (1979) 870. [64] D. J. Klein, T. G. Schmalz, G. E. Hite, A. Metropoulos, & W. A. Seitz, Chem. Phys. Lett. 120 (1985) 367. T. Oguchi, H. Nishimori, & Y. Taguchi, J. Phys. Soc. Jpn. 55 (1986) 323. S. Kivelson, D. Rokhsar, & J. Sethna, Phys. Rev. B 35 (1987) 8865. B. Sutherland, Phys. Rev. B 37 (1988) 3786. B. Sutherland, Phys. Rev. B 38 (1988) 7192. D. S. Rokhsar & S. A. Kivelson, Phys. Rev. Lett. 61 (1988) 2376. S. Sachdev, Phys. Rev. B 40 (1989) 5204. N. Read & S. Sachdev, Nucl. Phys. B 316 (1989) 609. T. Blum & Y. Shair, J. Stat. Phys. 59 (1990) 333. [65] D. J. Klein & M. A. Garcia-Bach, Phys. Rev.
B 19 (1979) 877. G. Baskaran, Z. Zou, & P. W. Anderson, Solid State Commun. 159 (1987) 973. M. A. Garcia-Bach, A. Penaranda, & D. J. Klein, Phys. Rev. B 45 (1992) 10891. C. Zeng & J. B. Parkinson, Phys. Rev. B 51 (1995) 11609. [66] S. Liang, N. Doucet, & P. W. Anderson, Phys. Rev. Lett. 61 (1988) 365. [67] I. Affleck & J. B. Marston, Phys. Rev. B 37 (1988) 316. B. Shraiman & E. Siggia, Phys. Rev. Lett. 62 (1989) 1564. V. Kalmeyer & R. B. Laughlin, Phys. Rev. Lett. 59 (1987) 2095. R. B. Laughlin, Phys. Rev. Lett. 60 (1988) 2677. X. Wen, F. Wilczek, & A. Zee, Phys. Rev. B 39 (1989) 11413.
P. W. Anderson, B. S. Shastry, & D. Hristopulos, Phys. Rev. B 40 (1989) 8939. [68] W. Marshall, Proc. Roy. Soc. (London) A 232 (1955) 48. [69] T. Barnes & E. S. Swanson, Phys. Rev. B 37 (1988) 9405. T. Barnes, D. Kotchan, & E. S. Swanson, Phys. Rev. B 39 (1989) 4357. J. Carlson, Phys. Rev. B 40 (1989) 846. K. J. Runge, Phys. Rev. B 45 (1992) 12292. [70] N. Metropolis, A. W. Rosenbluth, M. N. Rosenbluth, A. M. Teller, & E. Teller, J. Chem. Phys. 21 (1953) 1087. [71] M. Gros, E. Sanchez-Velasco, & E. Siggia, Phys. Rev. B 34 (1987) 2484. J. D. Reger & A. P. Young, Phys. Rev. B 37 (1988) 5978. E. Manousakis & R. Salvador, Phys. Rev. B 39 (1989) 575. H. De Raedt & W. von der Linden, Phys. Rev. B 45 (1992) 8787. N. Trivedi & C. M. Ceperley, Phys. Rev. B 41 (1990) 4552. M. Vekic & S. R. White, Phys. Rev. B 47 (1993) 16131. [72] W. J. Caspers, Phys. Rep. 63 (1980) 223. [73] H. P. van de Braak, W. J. Caspers, & M. W. M. Willemse, Phys. Lett. A 67 (1978) 147. [74] D. C. Mattis & C. Y. Pan, Phys. Rev. Lett. 61 (1988) 463 & 2279. H. Q. Lin & C. Y. Pan, J. Phys. C 8 (1988) 1415. H. Q. Lin & D. C. Campbell, Phys. Rev. Lett. 69 (1989) 2415. [75] T. P. Zivkovic, B. J. Sandleback, T. G. Schmalz, & D. J. Klein, Phys. Rev. B 41 (1990) 2249. T. G. Schmalz & D. J. Klein, Croat. Chem. Acta 66 (1993) 185. V. O. Cheranovski, Y. G. Schmalz, & D. J. Klein, J. Chem. Phys. 101 (1995) 5841. [76] S. R. White, Phys. Rev. Lett. 69 (1992) 2863. S. R. White, Phys. Rev. B 48 (1993) 10345. [77] S. R. White & D. A. Huse, Phys. Rev. B 48 (1993) 3844. C. C. Yu & S. R. White, Phys. Rev. Lett. 71 (1993) 3866. R. M. Noack, S. R. White, & D. J. Scalapino, Phys. Rev. Lett. 73 (1994) 882. U. Schollwöck & T. Jolicoeur, Europhys. Lett. 30 (1995) 493. S. J. Qin, S. D. Liang, Z. B. Su, & L. Yu, Phys. Rev. B 52 (1995) 5475. A. Sikkema & I. Affleck, Phys. Rev. B 52 (1995) 10207. W. Wang, S. Qin, Z. Y. Yu, L. Yu, & Z. Su, Phys. Rev. B 53 (1996) 40. H. Otsuka, Phys. Rev. B 53 (1996) 14004. L. Chen & S.
Moukouri, Phys. Rev. B 53 (1996) 1866. [78] J. T. Chayes, L. Chayes, & S. A. Kivelson, Commun. Math. Phys. 123 (1989) 53. M. Karbach, K.-H. Mütter, P. Ueberholz, & H. Kröger, Phys. Rev. B 48 (1993) 13666. [79] S. J. Cyvin & I. Gutman, Kekulé Structures in Benzenoid Hydrocarbons (Springer-Verlag, Berlin, 1988). P. John & H. Sachs, Top. Curr. Chem. 153 (1990) 145. R. S. Chen, S. J. Cyvin, B. N. Cyvin, J. Brunvoll, & D. J. Klein, Top. Curr. Chem. 153 (1990) 227. [80] L. Pauling, J. Solid St. Chem. 54 (1984) 297. B. Kamb & L. Pauling, Proc. Natl. Acad. Sci. USA 82 (1985) 8284. L. Pauling & B. Kamb, Proc. Natl. Acad. Sci. USA 82 (1985) 8286.
[81] H. Bethe, Zeit. Phys. 71 (1935) 205. C. N. Yang & C. P. Yang, Phys. Rev. 150 (1966) 221. [82] B. Davies, O. Foda, M. Jimbo, T. Miwa, & A. Nakayashiki, Commun. Math. Phys. 151 (1993) 89. G. Jüttner & B. D. Dörfel, J. Phys. A 26 (1993) 3105. Y. Yamada, J. Stat. Phys. 82 (1996) 51. J. Cizek & P. Bracken, Phys. Rev. Lett. 77 (1996) 211. [83] E. H. Lieb, T. D. Schultz, & D. C. Mattis, Ann. Phys. (NY) 16 (1961) 407. [84] J. H. Choi & W. Thorson, J. Chem. Phys. 57 (1972) 252. W. R. Thorson, J. H. Choi, & R. G. Hake, Intl. J. Quantum Chem. S1 (1967) 487. [85] P. W. Anderson, Mat. Res. Bull. 8 (1973) 153. P. W. Anderson & Fazekas, Phil. Mag. 30 (1974) 423. [86] D. J. Klein, T. P. Zivkovic, & R. Valenti, Phys. Rev. B 43 (1991) 723. [87] T. Kennedy, E. H. Lieb, & B. S. Shastry, J. Stat. Phys. 53 (1988) 1019. M. Fannes, B. Nachtergaele, & R. F. Werner, Commun. Math. Phys. 144 (1992) 443. T. Koma & H. Tasaki, J. Stat. Phys. 76 (1994) 745. [88] E. Manousakis, Rev. Mod. Phys. 63 (1991) 1. T. Barnes, Intl. J. Mod. Phys. C 2 (1991) 659. G. Senatore & N. H. March, Rev. Mod. Phys. 66 (1994) 445. E. Dagotto, Rev. Mod. Phys. 66 (1994) 763.
Z.B. Maksić and W.J. Orville-Thomas (Editors)
Pauling's Legacy: Modern Modelling of the Chemical Bond
Theoretical and Computational Chemistry, Vol. 6
© 1999 Elsevier Science B.V. All rights reserved.
Ab Initio Valence Bond Description of Diatomic Dications
Harold Baschᵃ, Pinchas Apedᵃ, Shmaryahu Hozᵃ and Moshe Goldbergᵇ
ᵃ Department of Chemistry, Bar Ilan University, Ramat Gan 52100, Israel
ᵇ Research & Development Directorate, Ministry of Defense, Hakiryah, Tel Aviv 61909, Israel

ABSTRACT
The electronic structure description of the He₂²⁺, NF²⁺ and O₂²⁺ diatomic dications has been explored using ab initio multi-structure valence bond self-consistent field theory in an extended atomic+polarization gaussian basis set. The ground state wave functions are expressed as linear combinations of covalent and ionic bonding forms. In the formally doubly-bonded NF²⁺ and triply-bonded O₂²⁺, each of the covalent and ionic bonding forms has a multi-configuration expansion. In all three dications, the covalent set by itself has all the qualitative bonding and barrier features of the full ground state curve, superimposed on the coulomb repulsion between atomic monocations. Nonetheless, covalent-ionic mixing is substantial and required for quantitative accuracy. With these properties, the diatomic dications seem to be well described as an ordinary chemical bond plus electrostatic repulsion, which is the approach taken by Pauling.
Dedicated to Linus Pauling
1. Introduction
Diatomic dications are esoteric, high-energy molecules which, in the best of cases, are difficult to characterize experimentally. Already in 1933 Pauling described the energy dissociation curve of the simple dication He₂²⁺ using a straightforward valence bond (VB) mixing of one covalent and two ionic configurations. The long-range (internuclear distance R > 1.3 Å) part of the energy interaction curve was found to fit the point-charge electrostatic repulsion (~1/R) inherent to the He⁺:He⁺ covalent configuration. The existence of a metastable energy minimum at short range (R ≈ 0.75 Å), behind a potential barrier hindering dissociation, was attributed to resonant covalent-ionic (He⁰He²⁺ and He²⁺He⁰) mixing. The detection of He₂²⁺ has been reported experimentally, although no spectroscopic constants have been obtained. However, consistent with his general description of the electron pair (single) bond (4), Pauling considered the helium diatomic dication to be fully described as an ordinary chemical bond plus electrostatic repulsion. Hurley adopted this picture of diatomic dications and predicted the energy interaction curves and spectroscopic constants of a number of them, both in the ground and electronically excited states, by scaling the potential curves of the isoelectronic neutral species and adding the electrostatic repulsion. One source of fascination for the dications has been the suggestion to use He₂²⁺, for example, as a source of propulsion energy. This application is based on the combined thermodynamic instability of the equilibrium geometry, showing a large exothermic dissociation energy, and kinetic metastability due to a barrier to dissociation. That this unusually shaped energy dissociation curve, showing a barrier and exothermicity, can be described as a superposition of an ordinary chemical bond curve plus electrostatic repulsion has been emphasized by Jonathan et al. and by Senekowitsch and co-workers. The latter workers also found, as had Hurley (6), that the scaling procedure using the isoelectronic neutral diatomics generally gives dication energy curves that do not agree quantitatively with experiment or the highest level ab initio curves. Interestingly enough, the addition of a mutual cation-cation polarization potential to the coulomb repulsion contribution brings the final curves into near-perfect agreement with the best available property values. An alternative model description of dication energy curves has been given by several groups. Based directly on the VB method, this model attributes the origin of the barrier in the ground state energy curve of diatomic dications AB²⁺ to an avoided crossing between a purely repulsive covalent
(A⁺B⁺) energy curve and an attractive (A²⁺B⁰) curve (with B more electronegative than A), where the attraction is due to the strong polarization interaction between A²⁺ and B. This model of interacting non-orthogonal VB structures has been quantified within a semiempirical framework by Radom and co-workers (14,15). The major criticism of the avoided crossing model at long range is that it seems to require an unphysically large coupling between asymptotically well-separated covalent and ionic structure curves to accurately locate the energy barrier to dissociation. However, this claim has been refuted.
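The avoided crossing picture described above amounts to a two-structure secular problem in a non-orthogonal basis. As a minimal sketch, assuming only two real VB structures with diagonal energies H11 (covalent) and H22 (ionic), coupling H12 and overlap G12, the two adiabatic energies are the roots of det(H - E·S) = 0. The function name and the numbers in the usage line are illustrative assumptions and are not taken from the calculations reported in this chapter.

```python
import math


def two_state_energies(h11, h22, h12, g12):
    """Return (lower, upper) roots of det(H - E*S) = 0 for two
    non-orthogonal structures with overlap g12 (|g12| < 1).

    Expanding (h11 - E)(h22 - E) - (h12 - E*g12)**2 = 0 gives
    a*E**2 + b*E + c = 0 with the coefficients below.
    """
    if abs(g12) >= 1.0:
        raise ValueError("overlap must satisfy |g12| < 1")
    a = 1.0 - g12 * g12
    b = -(h11 + h22 - 2.0 * g12 * h12)
    c = h11 * h22 - h12 * h12
    # The discriminant is non-negative for a real symmetric problem;
    # clamp at zero to guard against rounding noise.
    root = math.sqrt(max(b * b - 4.0 * a * c, 0.0))
    return ((-b - root) / (2.0 * a), (-b + root) / (2.0 * a))


# Hypothetical degenerate structures (arbitrary energy units): the
# lower root is (h11 + h12)/(1 + g12), the upper (h11 - h12)/(1 - g12).
lower, upper = two_state_energies(-1.0, -1.0, -0.5, 0.2)
```

Evaluating such a function at each internuclear distance, with distance-dependent inputs, would trace out the lower (ground state) and upper adiabatic curves of the model.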
At short range, the avoided crossing model goes into less detail, but it seems to imply a total lack of conventional covalent (A⁺B⁺) bonding. An attractive component in the shorter R region, superimposed on the coulomb repulsive curve for the A⁺B⁺ covalent interaction in diatomic dications within a VB framework, has also been observed in ab initio calculations (18,19). The attractive component could be interpreted as an actual bonding interaction or as a polarization induced stabilization. Given the tendency to short equilibrium internuclear distances for dications, where the polarization interaction would be severely damped, the former mechanism is likely the more dominant. This returns us to the original Pauling VB dication model, which has a (weakly bound) covalent (A⁺B⁺) configuration interacting strongly with the higher energy ionic A²⁺B⁰ configuration having a minimum in the region of the ground state energy minimum. This general situation has also been found in the VB description of a number of (single) bond energy dissociation curves in neutral species which involve at least one electronegative dissociating atom like fluorine. In such cases, the covalent-ionic resonance interaction determines the binding energy. In other cases, the covalent curve itself can be strongly bound. Examples of both kinds of binding situations have been discussed in dications (24,25). A covalent-ionic curve crossing is not unique to dications; it has also been found in the SiH₃-F energy curve, for example, but without a resultant barrier (26). In this paper we will explore the forms of the VB covalent and ionic curves for He₂²⁺, O₂²⁺, and NF²⁺ to see how they fit the models of dication binding that have been proposed, and where they fit in the spectrum of parameters that are used to classify bonding systems according to these models. The method of choice for this analysis is ab initio multi-structure VB self-consistent field (MSVBSCF) theory using nonorthogonal orbitals (27-29). In this method each electronic configuration, defined as a particular assignment of electrons to VB orbitals, is expressed as a set of one or more VB structures which differ from each other in their spin coupling. The VB structures are written as linear
combinations of determinantal functions of the spin-orbitals. The nonorthogonal orbitals, which are usually atom or fragment localized, are expanded in the familiar atom-centered gaussian basis sets. The orbitals are divided into active and passive sets. The active set of orbitals can have different occupancies in different configurations. The passive orbitals have a fixed occupancy of two electrons each in all the configurations. All the VB calculations were carried out using the TURTLE set of computer codes, obtained from Dr. J. H. van Lenthe of Utrecht University, Holland. The results on He₂²⁺ have been partly reported previously (19), but those for O₂²⁺ and NF²⁺ are completely new here.

2. He₂²⁺
As noted above, the helium diatomic dication has been well studied. The standard three-configuration VB wave function for the "singly-bonded" (He$_a$-He$_b$)$^{2+}$ consists of a linear combination of configurations (1) to (3):

$$(\mathrm{He}_a^{+}{:}\mathrm{He}_b^{+}) \qquad [\,1s_a(1)\,1s_b(2) + 1s_b(1)\,1s_a(2)\,] \tag{1}$$

$$(\mathrm{He}_a^{2+})(\mathrm{He}_b^{0}) \qquad 1s_b'(1)\,1s_b'(2) \tag{2}$$

$$(\mathrm{He}_a^{0})(\mathrm{He}_b^{2+}) \qquad 1s_a'(1)\,1s_a'(2) \tag{3}$$
where only the unnormalized spatial part of (1) is shown. At the simplest theory level the same set of atom-localized orbitals is used for both the covalent [(1)] and ionic or electron transfer [(2) and (3)] configurations, giving $1s_a' = 1s_a$ and $1s_b' = 1s_b$. This will be called level I. A significant increase in accuracy can be achieved if the covalent and ionic structures have different optimized VB orbitals. This will be called level II, where $1s_a'$ and $1s_b'$ are allowed to VBSCF optimize as different orbitals than $1s_a$ and $1s_b$, respectively, to suit their "ionic" charge distribution (19,29,3?,33,34). The tailored ionic orbitals have been called "breathing orbitals" (29). The rationale for using tailored orbitals for different VB structures is that the spatial extent of an orbital in an ionic configuration can be substantially different than in a covalent configuration. The use of different orbitals in different configurations is equivalent to a limited configuration interaction calculation with an expanded space of orbitals used to construct more configurations. The nonorthogonal property of the VB orbitals allows the tailored orbital optimization as a natural part of the VBSCF process, something that would be much more difficult using orthogonal molecular orbitals (MO). The He atom $1s_a$, $1s_b$, $1s_a'$ and $1s_b'$ orbitals were expanded using the (10s,6p) gaussian primitive basis set optimized for the He(³P) state from Partridge's
compilation. The atom basis set was contracted [61111/3111] and augmented by two single primitive d-type functions (5 components) with exponents of 3.500 and 1.000 (19). Since the VB orbitals are atom localized they don't individually obey inversion symmetry; the homonuclear diatomic point group symmetry is obtained by having $1s_a$ and $1s_b$ (and $1s_a'$ and $1s_b'$) symmetrically equivalent. Analogously, the 1s designation really means σ in local diatomic symmetry, which includes expansion in all the s-, p- and d-type basis functions of local σ symmetry. The resulting energy dissociation curves for the level II $^1\Sigma_g^+$ electronic ground state, and the separately calculated covalent (He⁺:He⁺) configuration, are shown in Figure 1. Here, as noted earlier, the energy curve for the He⁺:He⁺ covalent configuration is not purely coulombic repulsive (~1/R), as has been assumed in the avoided crossing model, but shows an energy minimum at short internuclear distance ($R_e$ ≈ 0.711 Å) lying 0.80 eV below the barrier to dissociation at $R_b$ ≈ 1.034 Å. The attractive part of the covalent configuration energy profile can be interpreted as a stabilizing charge-induced dipole interaction superimposed on the monotonically coulomb repulsive He⁺:He⁺ curve, as discussed above. Alternatively, and more likely, the minimum should be interpreted as mainly "normal" two-center spin-coupled covalent bonding between the He⁺ cations. The stability of such a bond would be enhanced by the calculated close approach of the two He⁺ ions, which would allow penetration attraction between the electrons on each ion and the other nucleus. The total level II calculated barrier energy ($E_b$) for the ground state curve, defined as the energy difference between the minimum ($R_e$ ≈ 0.726 Å) and barrier maximum ($R_b$ ≈ 1.115 Å) distances, is 1.00 eV. Thus the covalent curve by itself shows ~80% of the barrier energy. The remaining 0.2 eV of the barrier energy comes from covalent-ionic mixing or resonance energy, even though at the equilibrium bond distance the ionic-covalent energy gap is ~20 eV. In fact, the
covalent-ionic resonance interaction contributes ~0.82 eV to get from the purely covalent configuration energy to the level II ground state energy near the equilibrium He-He distance, and this gap decreases to ~0.63 eV at ~$R_b$. Thus, covalent-ionic mixing in the He₂²⁺ dication is relatively large (17,18). The calculated exothermicity from the energy minimum to the monocation asymptotes is ~9.34 eV (19) for the level II ground state and ~10.17 eV for He⁺:He⁺ alone. In summary, He₂²⁺ has a strongly bound covalent configuration energy curve which shows all the
features of the full ground state energy curve. However, the total binding energy is determined essentially by covalent-ionic mixing. The results obtained here can also be used to improve the avoided crossing model, besides indicating the need for adding an effective polarization potential to the coulomb repulsive covalent curve. The large covalent-ionic mixing in the semiempirical avoided crossing model (14,15) has been calculated using the Wolfsberg-Helmholz formula for the covalent-ionic resonance interaction ($H_{12}$):

$$H_{12} = K\,G_{12}\,[H_{11} + H_{22}] \tag{4}$$
where $H_{11}$ and $H_{22}$ are the diagonal Hamiltonian matrix elements for the covalent and ionic configurations, respectively, and $G_{12}$ is the overlap integral between them. Equation (4) can be inverted to solve for values of K at each internuclear distance. Using the level II VBSCF calculated values of $H_{11}$, $H_{22}$ and $G_{12}$, K is found to be nearly constant, with an average value of ~1.215, slowly increasing from K = 1.195 at R = 0.65 Å to K = 1.235 at R = 1.50 Å. He₂²⁺ may not be the most representative example of a chemical system. An analogous analysis of the SiH₃-F energy dissociation curve shows K ≈ 1 throughout the whole interfragment distance range (26).
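The inversion of Equation (4) is a one-line rearrangement, K = H12 / (G12 [H11 + H22]). The sketch below, assuming matrix elements given in any consistent energy unit, returns K from the four quantities and rebuilds H12 from K as a consistency check. The function names and the numerical inputs are hypothetical placeholders, not values from the level II calculations.

```python
def wolfsberg_helmholz_k(h11, h22, h12, g12):
    """Invert Eq. (4), H12 = K * G12 * (H11 + H22), for K."""
    denominator = g12 * (h11 + h22)
    if denominator == 0.0:
        raise ValueError("G12 * (H11 + H22) must be non-zero")
    return h12 / denominator


def wolfsberg_helmholz_h12(k, h11, h22, g12):
    """Forward form of Eq. (4): resonance interaction from K."""
    return k * g12 * (h11 + h22)


# Hypothetical matrix elements (arbitrary energy units) at one distance.
k = wolfsberg_helmholz_k(h11=-2.0, h22=-1.0, h12=-0.6, g12=0.1)
h12_check = wolfsberg_helmholz_h12(k, h11=-2.0, h22=-1.0, g12=0.1)
```

Applying the first function to the calculated matrix elements at each internuclear distance would give the K(R) profile discussed above.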
3. O₂²⁺

The oxygen diatomic dication has been studied extensively, both experimentally and theoretically (5,6,37-57). The equilibrium internuclear distance ($R_e$) has been measured at 1.073 Å (49), which is probably the shortest O-O distance known (45). The best estimate for the dissociation energy ($D_e$) to two ground state oxygen atom monocations is ≈ -90 kcal/mole (45,47,51,52,57), where the minus sign indicates exothermicity. The O-O distance at the barrier maximum ($R_b$) is relatively sensitive to the theory level and has been estimated to be ~1.64 Å (52,57). On the other hand, the computed height of the barrier ($E_b$) seems to be stable to calculational improvements at ~80 kcal/mole (45,52), measured from $R_b$ to $R_e$. These bond distance and energy values delineate the energy dissociation curve of O₂²⁺ to its ground state atomic monocation asymptotes. O₂²⁺ is isoelectronic and isovalent with N₂ and both have a formal triple bond. The 14 electrons in the dication are divided into a six-electron active space and an eight-electron passive space. The latter consists essentially of the doubly-occupied 1s and 2s atomic orbitals on each center. In molecular orbital theory the electronic configuration of the active or triple bond part of
the wave function is written as $\sigma^2\pi^4$, or more explicitly, $\sigma_z^2\pi_x^2\pi_y^2$ in cartesian representation, where each doubly-occupied σ and π bonding MO is spin-paired. Here, $\sigma_z$, $\pi_x$ and $\pi_y$ are appropriate symmetry combinations of the z, x and y basis functions on each atom center (O$_a$ and O$_b$). The VB description of the active space starts from the atom-localized orbitals, and bonding is achieved through spin pairing on the interatomic level. The basic spatial configuration is:

$$z_a^1\,z_b^1\,x_a^1\,x_b^1\,y_a^1\,y_b^1 \tag{5}$$
with each active space VB orbital singly-occupied. This configuration also has the correct ground state spatial symmetry. Multiplying configuration (5) is a six spin product function with three α and three β spins. Twenty determinants can be constructed that distribute these six spin functions among the six orbitals to give Ms=0. Such a product function of six spins can belong to electronic states with total spin S=0, 1, 2 and 3. The ground electronic state of O₂²⁺ is a singlet spin state (S=0) and there are only five linearly independent combinations of the twenty Ms=0 determinants that belong to S=0. Therefore, configuration (5) and the appropriate spin functions give rise to five linearly independent VB structures, each with its own variationally determined coefficient in a multi-structural expansion of the ground state wave function. Among these five VB structures arising from configuration (5) there is only one structure that has simultaneous two center spin pairing between za and zb, xa and xb, and ya and yb, and can be considered to represent the pure covalent triple bond: (za:zb)(xa:xb)(ya:yb). This structure is listed as number 1 in Table 1. The other four structures belonging to configuration (5) have elements of only one spin-paired interatomic covalent bond each, with the remaining coordinate pairs having parallel spins. The particular spin eigenfunctions used in this study, as the default option in TURTLE (30), are called Rumer functions (58).
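The determinant and spin-eigenfunction counts quoted above can be checked with a few lines of Python. This is a minimal sketch based on the standard branching-diagram formula for the number of spin eigenfunctions of N singly-occupied orbitals; the function names are illustrative.

```python
from math import comb


def n_ms0_determinants(n_singly):
    """Number of Ms=0 determinants for n singly-occupied orbitals (n even)."""
    return comb(n_singly, n_singly // 2)


def n_spin_eigenfunctions(n_singly, two_s):
    """Number of linearly independent spin eigenfunctions with S = two_s / 2."""
    if two_s < 0 or two_s > n_singly or (n_singly - two_s) % 2:
        return 0
    k = (n_singly - two_s) // 2  # number of beta spins when Ms = S
    return comb(n_singly, k) - (comb(n_singly, k - 1) if k > 0 else 0)


n_dets = n_ms0_determinants(6)
counts = {two_s // 2: n_spin_eigenfunctions(6, two_s) for two_s in (0, 2, 4, 6)}
print("Ms=0 determinants:", n_dets)
print("spin eigenfunctions per S:", counts)
```

For six singly-occupied orbitals the S=0 count is the number of VB structures of configuration (5), and the counts over S=0 to 3 add up to the number of Ms=0 determinants.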
A descriptive listing of the occupancies, configurations and structures is given in Table 1. Additional configurations to (5) that belong to the ground state ¹Σg⁺ symmetry and maintain an equal number of electrons on each center (a covalent distribution) can be constructed by not restricting the occupancy of each of the active VB orbitals to one electron, while still maintaining an equal number of α and β spins. Another way of looking at this is to consider "promotions" of electrons, starting from configuration (5), that create doubly-occupied orbitals among the active set but maintain an equal number of electrons on each center. This latter restriction to covalent configurations requires that two electrons be "promoted" simultaneously,
for balance. If the additional constraint is imposed that, overall, no more than two electrons can occupy the x, y or z orbitals on both centers combined, then the set of configurations 2 to 7 in Table 1 is obtained. As is clear from Table 1, configurations 2 to 7 have only a single two center spin-paired covalent bond each and only one structure per configuration. Configurations 1 to 7 (structures 1 to 11) represent the covalent part of the electronic ground state of O₂²⁺.

The ionic or charge transfer configurations are considered next. In these, two electrons are distributed among the three active orbitals on one center and four electrons are distributed among the three orbitals on the other center, again using three α and three β spins, and with the restriction of not having more than two electrons in any given direction (x, y or z). The resultant configurations 8 to 13 for O²⁺O⁰ are also shown in Table 1. Configurations 8 to 10 have four unpaired spins, which give rise to two linearly independent S=0 structures each, formed from six determinants. Structures 12 to 17 have two covalent type spin-paired bonds each. On the other hand, configurations 11 to 13 (structures 18 to 20) have no covalent bonds between the two centers. There is an equivalent set of opposite direction ionic O⁰O²⁺ electron occupancies, labeled configurations 14 to 19 (structures 21 to 29), which are also listed in Table 1.

Finally, two electron ionic or charge transfer configurations, O³⁺O⁻ and O⁻O³⁺, for the ground electronic state of O₂²⁺ can also be constructed using the same set of spin and symmetry restrictions described above. These give rise to configurations 20 to 22 (structures 30 to 32) for the former, and configurations 23 to 25 (structures 33 to 35) in the latter case. As shown in Table 1, these configurations each show a single spin-paired bond between the centers.

VBSCF calculations were carried out using the Huzinaga 9s5p oxygen atom basis set (59) contracted [5111/311].
Single sets of s- and p-type diffuse gaussians were added to the atom set on each center with exponents 0.0862 and 0.0637, respectively, taken from an even-tempered extension of the respective valence functions. In addition, two d-type gaussian functions (5 components each) were used on each oxygen atom, with exponents 1.050 and 0.3000. All the active and passive VB orbitals were atom localized, except as indicated below, and were expanded in the full set of basis functions appropriate to the local σ or π diatomic symmetry of each VB orbital. As noted in the discussion on He₂²⁺, we can expect the oxygen orbitals for the covalent structures (1 to 11) to be different from those of the (singly) ionic structures (12 to 20 and 21 to
29); and certainly expect the orbitals of the two sets of ionic structures [O²⁺O⁰ and O⁰O²⁺] to be different from each other. Here, the situation is more complicated than in the single bond case, like He₂²⁺, where there is only one configuration and corresponding structure of each bonding type: covalent and two ionic. Ideally, perhaps, each structure in Table 1 should have its own set of VB orbitals. However, that would be too difficult, and the approach adopted here was to use a different complete (passive+active) set of VB orbitals for all the structures belonging to a given bonding type. Thus, altogether 30 VB orbitals are used; ten each for the covalent and the two (singly) ionic bonding types.

For comparison purposes, multi-structure VBSCF calculations were also carried out using a common set of (ten) passive and active VB orbitals for all 35 structures in Table 1, including the doubly-ionic structures. This theory level is labeled level I. Level II uses a different set of orbitals for each different bonding type, as described above, but encompasses only the 29 structures representing the covalent and (singly) ionic bonding types in Table 1. The O₂²⁺ energy dissociation curve was generated pointwise as a function of O-O distance (R), and the extremum point (energy minimum and transition state) bond lengths and energies were obtained by quadratic or cubic fitting of the bracketing points. The resulting energy dissociation curves are shown in Figure 2. The calculated values of the equilibrium and barrier top distances, as well as the dissociation and barrier height energies, are tabulated in Table 2. These are also compared with experiment and previous results. Two more levels of calculation were carried out, but only for points in the immediate neighborhood of the energy minimum and maximum.
Level III adds the two electron transfer configurations 20 to 25 (structures 30 to 35) to the level II list, where the doubly-ionic structures have the same set of VB orbitals as the corresponding direction singly-ionic structures. Thus, in level III, structures 12 to 20 and 30 to 32 have the same set of ten VB orbitals, while structures 21 to 29 and 33 to 35 share their own, different set of ten VB orbitals. The covalent structures (1 to 11), of course, also have their own set of passive+active VB orbitals. Level IV builds on level III by allowing delocalization mixing among the passive VB orbitals on different centers for all the bonding types. As noted above, the doubly-occupied passive orbitals represent essentially the 1s and 2s atomic orbitals on each oxygen center and are not shown in Table 1. The delocalized mixing of these inner electrons is not expected to be highly significant, but was examined. In the level IV calculations the six active VB orbitals shown in Table 1 remain atom localized.
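The quadratic fitting of bracketing points used to locate the extremum points can be sketched as follows. This is a minimal illustration using NumPy; the three data points are synthetic, generated from a model parabola, and are not VBSCF energies, and the function name is hypothetical.

```python
import numpy as np


def quadratic_extremum(r, e):
    """Fit a parabola through bracketing points and return its stationary point.

    Returns (R*, E*, kind), where kind is "minimum" or "maximum".
    """
    r = np.asarray(r, dtype=float)
    e = np.asarray(e, dtype=float)
    r0 = r.mean()                       # centre R to keep the fit well conditioned
    a, b, c = np.polyfit(r - r0, e, 2)  # E = a (R - r0)^2 + b (R - r0) + c
    r_star = r0 - b / (2.0 * a)
    e_star = c - b * b / (4.0 * a)
    kind = "minimum" if a > 0 else "maximum"
    return r_star, e_star, kind


# Synthetic bracketing points from a model parabola (illustration only).
r_pts = [1.04, 1.06, 1.08]
e_pts = [50.0 * (r - 1.064) ** 2 - 110.9 for r in r_pts]
r_min, e_min, kind = quadratic_extremum(r_pts, e_pts)
print(f"{kind} at R = {r_min:.4f}, E = {e_min:.2f}")
```

The same function applies at the barrier top, where the fitted curvature is negative; a cubic fit would need at least four bracketing points.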
All four levels of calculation (I to IV) have exactly the same covalent structure asymptotic energy. Therefore, the progression from level I to level IV represents an increase in the accuracy level of the calculations, as shown by the decreasing exothermicity (Table 2) measured from the respective equilibrium distances to the common O⁺ + O⁺ ground state dissociation limit. Due to the nonorthogonality of the VB orbitals and, consequently, of the VB structures, the weight of each structure i (Wi) is calculated from the VBSCF structure expansion coefficients (Ci) using the formula (60)
$$ W_i = C_i \sum_j S_{ij} C_j \qquad (6) $$
where Sij is the overlap integral between structures i and j.

Figure 2 shows the O₂²⁺ ground state energy dissociation curve for the level I and level II calculations. As discussed above, the 29 structure representation of the VB wave function (Table 1) divides into covalent and ionic bonding types. The energy dissociation curve for the covalent-type structure set alone (structures 1 to 11 in Table 1) can be calculated as a separate, covalent only (O⁺:O⁺) ground state wave function. Analogously, one set of ionic structures (12 to 20) can also be used to generate a separate, purely ionic O²⁺O⁰ VBSCF curve. These are also shown in Figure 2. The O⁺:O⁺ curve is seen to have a shape which closely mimics the level I and level II curves. Thus, in Table 2, an energy minimum is found at Re=1.039 Å and a barrier at Rb=1.466 Å, with a barrier height energy (Eb) of 77.5 kcal/mole for the O⁺:O⁺ bonding type. The Re and Eb values are not very different from the corresponding parameter values calculated at levels I and II. In general in Table 2, Re and Rb are seen to increase with the level of calculation, and the O⁺:O⁺ calculated values for these distances fall into this trend at the low accuracy end. The calculated exothermicity for O⁺:O⁺ is 154.3 kcal/mole, which is 39% greater than the level II value, the most appropriate for comparison. An especially interesting quantity is the barrier height energy (Eb) at Rb relative to the energy minimum at Re, which is calculated to be 77.5 kcal/mole for O⁺:O⁺ compared to 70.6 to 71.9 kcal/mole for levels I to IV. Thus the O⁺:O⁺ curve alone has the qualitative and almost the quantitative features of the full covalent+ionic ground state energy curve.

Figure 2 shows that the ionic group curve, O²⁺O⁰, is asymptotically flat, going to the O²⁺ + O⁰ dissociation limit. The curve falls rapidly as R gets smaller and reaches a deep minimum at R~1.00 Å. The individual energy curve behaviors of the covalent and ionic structure groups,
combined with the increasing covalent-ionic resonance interaction as R decreases, lead to the calculated increases in Re and Rb, and decreases in Eb and De, in going from the O⁺:O⁺ level to level II. As noted previously, Rb seems to be particularly sensitive to the level of calculation. The combined weight of the ionic group structures, as calculated from equation (6) for the level II wave function, is found to decrease uniformly with increasing R. Thus, at R=1.05 Å the combined weight of the covalent structures at level II is 45%, while at R=1.60 Å it is 74%. Thus, although the covalent structure group curve by itself already shows approximately the qualitative features of the full structure set curve at level II, covalent-ionic mixing is strong, and quantitative accuracy requires taking into account the interaction between the already strongly bonding covalent configuration set and the more strongly attractive ionic set.

A comparison of the levels I to IV results in Table 2 for the different distance and energy parameters with experiment and previous high level results shows improving agreement with increasing level of calculation. Probably the most significant defect in the VBSCF calculations reported here is in the basis set, which lacks f-type functions. The importance of f-type basis functions in O₂²⁺ to the quantitatively accurate calculation of energy properties has been demonstrated previously (45,52,57).

4. NF²⁺
The NF²⁺ diatomic dication has been studied both experimentally (61) and theoretically (45,62). Although valence isoelectronic with O₂²⁺, with a formal σz²πx²πy² MO electronic configuration, the electron distribution between the asymptotic constituent (N⁺ and F⁺) ions is not isovalent with the O₂²⁺ system. Therefore, the construction of the VB configurations is different in NF²⁺ than for O₂²⁺. Table 3 shows the set of active space covalent and ionic configurations and structures appropriate for NF²⁺. NF²⁺ is isovalent with the CO molecule, which has a formal double bond. The covalent (N⁺:F⁺) configurations consist of three doubly-bonded configurations (za¹zb¹xa¹xb¹, za¹zb¹ya¹yb¹ and xa¹xb¹ya¹yb¹) and three no bond types. For each double bond configuration the four unpaired spins give rise to two linearly independent VB structures with overall singlet state spin coupling. The no bond configurations have all the active space electrons in doubly-occupied VB orbitals, with no case of the two center spin pairing that defines the covalent bond. Altogether, the N⁺:F⁺ covalent set consists of six configurations and nine structures. For obvious reasons the N⁺:F⁺ covalent structure group in Table 3 is exactly the same as the O²⁺O⁰ ionic set in Table 1.
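The configuration and structure counts for a given distribution of active electrons can be reproduced by brute-force enumeration. The sketch below assumes the restriction introduced for O₂²⁺ (no more than two electrons per coordinate direction, which with six active electrons means exactly two in each of z, x and y) and takes two active 2p electrons on N⁺ and four on F⁺; the function names are illustrative.

```python
from itertools import product
from math import comb

# Allowed (centre a, centre b) occupancies of one direction holding two electrons.
PAIR_OCCUPANCIES = ((1, 1), (2, 0), (0, 2))


def singlet_count(n_unpaired):
    """Number of linearly independent S=0 couplings of n singly-occupied orbitals."""
    if n_unpaired % 2:
        return 0
    k = n_unpaired // 2
    return comb(n_unpaired, k) - (comb(n_unpaired, k - 1) if k else 0)


def enumerate_set(n_a, n_b):
    """Count configurations and singlet structures with n_a active electrons
    on centre a and n_b on centre b, two electrons per direction (z, x, y)."""
    configs = [occ for occ in product(PAIR_OCCUPANCIES, repeat=3)
               if sum(o[0] for o in occ) == n_a and sum(o[1] for o in occ) == n_b]
    structures = sum(singlet_count(2 * sum(o == (1, 1) for o in occ))
                     for occ in configs)
    return len(configs), structures


covalent_nf = enumerate_set(2, 4)   # N+:F+ (same electron count as O2+ O0)
covalent_oo = enumerate_set(3, 3)   # O+:O+
print("N+:F+ (configurations, structures):", covalent_nf)
print("O+:O+ (configurations, structures):", covalent_oo)
```

Calling `enumerate_set` with other electron distributions, such as (1, 5) or (0, 6), gives the corresponding counts for the ionic sets under the same restriction.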
The N²⁺F⁰ ionic set has three configurations, each with only one two-centered, spin-paired bond. On the other hand, the N⁰F²⁺ ionic set has the same configuration and structure list as O⁺:O⁺ in Table 1, including the triple bond configuration (number 10 in Table 3). We can, therefore, expect the N⁰F²⁺ ionic type bonding to be important to the ground state electronic structure description even though it is higher in energy than the N²⁺F⁰ ion asymptote. The doubly-ionic N³⁺F⁻ set has only one member in the configuration list, with all the active electrons located in doubly-occupied F orbitals to describe closed shell F⁻. The doubly-ionic N⁻F³⁺ set is the mirror image of the N⁺:F⁺ set in electron count, with the same distribution of configurations and structures, but with the number of electrons switched between the nitrogen and fluorine centers. Finally, as noted in the discussion on O₂²⁺, this configuration list includes only those distributions of the active electrons that restrict the occupancy of any coordinate direction (x, y or z) to no more than two electrons each. Additional configurations that have the proper overall electronic ground state ¹Σ⁺ symmetry can be constructed by allowing four active electrons in a given coordinate direction, doubly occupied on both centers. Previous experience has shown that such VB configurations are too high in energy to make any significant contribution to the electronic structure description.

As with O₂²⁺, the VBSCF calculations on NF²⁺ were carried out on several levels. Level I includes all 33 structures described in Table 3 arising from the covalent, and singly- and doubly-ionic configurations. The VB orbitals used to construct all the configurations (including the doubly-occupied passive electrons not shown in Table 3) are common to all the bonding types and are atom localized. Thus, ten VB orbitals are optimized at each N-F internuclear distance (R).
In the level II calculations only the first 16 configurations (23 structures) listed in Table 3 were used for the wave function, but a different set of ten (passive+active) orbitals was used for each of the three bonding types: N⁺:F⁺, N²⁺F⁰ and N⁰F²⁺. All the structures belonging to a given bonding type used the same set of ten VB orbitals. Altogether at level II, 30 VB orbitals were optimized at each internuclear distance. At level III the doubly-ionic structures (configurations 17 to 23) were added to the level II set, and only the extremum points (Re and Rb) were recalculated using a grid of adjacent points. In level III the same set of ten VB orbitals is used for the singly- and doubly-ionic structures showing electron transfer in the same direction. The total number of optimized VB orbitals remains at 30. The level IV calculations start from level III and add delocalized mixing
among the doubly occupied passive VB orbitals on the two centers for all the structures. The active orbitals remain atom localized. The progression from level I to level IV represents steady improvement in theory level (2~,31). The valence+polarization basis sets for the nitrogen and fluorine atoms were taken from the Huzinaga compilation, as described for the oxygen atom (59). The d-type function exponents are 1.000 and 0.2800 on nitrogen, and 1.100 and 0.3200 on fluorine.

The energy dissociation curves for the electronic ground states of NF²⁺ were generated pointwise and the extremum points fitted/interpolated from the surrounding values. The energy curves for the N⁺:F⁺, N²⁺F⁰ and N⁰F²⁺ bonding types were generated in separate VBSCF calculations for each, using structures 1 to 9, 10 to 12, and 13 to 23, respectively, from Table 3. The ground state energy curves at levels I and II, and for the three bonding types, are shown in Figure 3. The calculated internuclear distances and energy difference values for the extremum points are tabulated in Table 2, including a comparison with previous work.

Although NF²⁺ has been detected experimentally (61), there are no reported spectroscopic constants for comparison with the calculated values. The highest level previously obtained theoretical values used a multi-reference configuration interaction (MRCI) method in a large gaussian basis set that included f-type functions on each center (62). The lack of such f-type basis functions is probably the greatest deficiency in the VBSCF calculations reported here. Still, the current results generally compare very well with previous work (45,62), and this is shown in Table 2. The calculated value of Re seems to have stabilized at ~1.10 Å at all method levels. The dissociation energy from Re to the asymptotic N⁺ + F⁺ limit, at -25.0 kcal/mole, is still about 2.5 kcal/mole too exothermic at level IV compared to the best previous results (61). This difference can be attributed to the absence of f-type functions in the basis set, which is known to particularly affect dissociation energies in multiply-bonded systems. The location of the barrier maximum (Rb) is sensitive to theory level, as expected, since it lies at, or near, what should be the most multiconfigurational point on the energy curve. The progression of Rb values from 2.054 Å at level II to 2.061 Å at level IV agrees well with the higher level previously calculated Rb distance of 2.085 Å (62). Finally, the barrier height (Eb) in Table 2, measured from Re to Rb, is also converging from level II to level IV to the best previous value of 112.5 kcal/mole, with a remaining difference
of ~2.2 kcal/mole at level IV. In summary, the VBSCF calculated extremum point distances and energies are in good agreement with higher level MO theory results.

The covalent N⁺:F⁺ curve alone, shown in Figure 3, has a substantial (94.8 kcal/mole) binding energy relative to the barrier height, which is almost comparable to the ground state barrier energy at level II (109.2 kcal/mole). The shape of the covalent curve parallels those of the level I and II ground state curves, including the barrier, but with the barrier maximum distance (Rb) ~0.4 Å shorter than in I and II. The N⁺:F⁺ curve is shifted to higher energy at shorter N-F distances (R<2 Å) and merges into the asymptotic ground state curves at larger internuclear distances. Therefore, the N⁺:F⁺ dissociation energy from Re to asymptotic N⁺ + F⁺ is much more exothermic than for the level I and II ground states, while the barrier energy from Re to Rb is calculated to be close to the ground state values. Both the shape of the N⁺:F⁺ curve and the large calculated barrier energy (Eb) indicate that the N⁺:F⁺ structure set is showing real covalent binding.

The ionic N²⁺F⁰ and N⁰F²⁺ binding energy curves are also shown in Figure 3. The curve for N²⁺F⁰, which has the lower asymptotic energy, has an energy minimum at R~1.075 Å, which is close to both the ground state and covalent N⁺:F⁺ energy minima. The higher lying N⁰F²⁺ asymptote energy curve has an even deeper minimum at R~1.00 Å, with the hint of a small barrier at R~1.78 Å of indeterminate, but small, height. The deeper minimum of the N⁰F²⁺ ionic bonding is, perhaps, a reflection of better bonding in terms of the number of two center spin-paired couplings, as noted above. Thus, N⁰F²⁺ ionic bonding, which is isovalent with N₂, actually has one triple and six single bond configurations in its VB structure list (Table 3), while N²⁺F⁰ has only three single bond couplings. The relative stability of the N⁰F²⁺ set is, therefore, especially large at shorter internuclear distances where multiple bonding is particularly effective, and the separately summed VB weights [equation (6)] of the two ionic sets are very similar at small R. At least half of the weight of the N⁰F²⁺ structures is due to the triple bond configuration along the whole level II ground state energy curve.

The electronic structure of the asymptotic N⁺ + F⁺ ions is a linear combination of configurations 1 and 2 (structures 1 to 4) in Table 3, with equal coefficients. Configurations 1 and 2 describe the simultaneous presence of both a σ and a π (two center, spin-coupled) bond, while configuration 3 has two π and no σ bonds. The calculated structure weights of the ground state at
level II show that the relative importance of configurations 1 and 2 relative to 3 is maintained along the whole energy dissociation curve. Thus, the σ bond in N⁺:F⁺ is more important than the π bond, and the effective overall bonding electronic configuration of N⁺:F⁺ is σ²π² rather than the alternative σ⁰π⁴. This means that the fluorine electron pair in the active space is preferentially located in the π space and not in the σ space, as might be expected on purely polarization interaction considerations between N⁺ and F⁺. This result argues for a covalent bond interaction in N⁺:F⁺. The same behavior, however, is not found in the isovalent O²⁺O⁰ set, where configuration 10 in Table 1 (σ⁰π⁴) grows in importance relative to 8 and 9 (σ²π²) as R increases from below Re to R~1.50 Å, before all three decay to zero. The combined weight of configuration 10 is largest along the whole energy curve.

In the N²⁺F⁰ ionic set, configuration 7 (Table 3) represents a single two center spin-paired σ bond, while 8 and 9 have one π bond each. The weight of the σ bond configuration is calculated to be larger than the weights of the π bond configurations along the whole level II energy curve. All the ionic structure coefficients decay to zero as the asymptotic N⁺ + F⁺ limit of the ground state is approached. For the N⁰F²⁺ ionic set, structures 18 and 19 have one σ bond and structures 20 to 23 have one π bond each. The largest weights here are registered for the π bonding configurations, along the whole ground state energy curve. The π bonding configurations represent the interaction between a z²x¹ (or z²y¹) distribution on one center with the corresponding x¹y² (or x²y¹) distribution on the other center. Structures 20 and 21 have the z² electrons on N⁰ and, therefore, they have an additional dative-type bond with F²⁺ in the σ space, which enhances their weights in the total wave function.
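The structure weights discussed here follow equation (6), and the weight of a bonding type is the sum over its structures. A minimal NumPy sketch is given below; the overlap matrix, the coefficients and the grouping are hypothetical three-structure placeholders, not VBSCF results.

```python
import numpy as np


def chirgwin_coulson_weights(c, s):
    """W_i = C_i * sum_j S_ij C_j for coefficients c and structure overlaps s."""
    c = np.asarray(c, dtype=float)
    s = np.asarray(s, dtype=float)
    return c * (s @ c)


# Hypothetical overlap matrix of three normalised, non-orthogonal structures.
s = np.array([[1.0, 0.3, 0.2],
              [0.3, 1.0, 0.1],
              [0.2, 0.1, 1.0]])
c = np.array([0.7, 0.4, 0.3])
c = c / np.sqrt(c @ s @ c)          # normalise the wave function: C^T S C = 1
w = chirgwin_coulson_weights(c, s)

# Illustrative grouping of structure indices by bonding type.
groups = {"covalent": [0], "ionic": [1, 2]}
group_weights = {name: float(w[idx].sum()) for name, idx in groups.items()}
print("structure weights:", np.round(w, 3))
print("group weights:", {k: round(v, 3) for k, v in group_weights.items()})
```

With a normalised wave function the weights add up to one, although with non-orthogonal structures an individual weight is not guaranteed to be positive.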
If the weights of the three types of bonding, N⁺:F⁺, N²⁺F⁰ and N⁰F²⁺, are summed separately over their respective contributing structures using the level II structure expansion coefficients for the ground state, then near Re the N⁺:F⁺ set is ~43%, N²⁺F⁰ is ~31% and N⁰F²⁺ is ~27% of the VBSCF wave function. A similar distribution of weights was found in O₂²⁺ near its Re. The relatively large contribution of the latter ionic group is apparently due to its stability at smaller R (Figure 3), as discussed above, and, perhaps, also to resonance interaction with the covalent structures. An interesting aspect of this distribution of weights is that although the N⁺:F⁺ curve alone shows substantial binding (Table 2) behind the barrier, its weight in the neighborhood of Re is less than 50% of the total wave function. Another curious aspect of the
calculated weights near Re is the near equivalence of the two ionic bonding sets, which is similar to the situation found in the homopolar dissociation of a homonuclear diatomic molecule. The usual situation in heteronuclear diatomics is that one ionic type dominates. The apparent paradox of large calculated ionic weights in a VB framework in what basically looks like a covalent situation has been commented upon by Shaik and coworkers (23) for SiH₃-Cl, and is connected to the covalent-ionic resonance interactions, which in NF²⁺ involve both ionic bonding types.

5. Summary

The He₂²⁺, O₂²⁺ and NF²⁺ diatomic dications have ground state energy curves that are characterized by a short equilibrium bond distance, an exothermic dissociation energy and a barrier to dissociation that bestows kinetic metastability on the dication. Due to their finite lifetimes all three dications have been detected experimentally.

The electronic structure description of these diatomic dications has been explored using ab initio multi-structure VB self consistent field theory in extended atomic plus polarization basis sets. The ground state wave functions are expressed as linear combinations of covalent and ionic bonding configurations. Attention is focused on the degree of two center spin-coupled bonding in the covalent interactions, in the presence of the coulomb repulsion between the monocations. The calculated quantities are the equilibrium bond length (Re), the barrier top distance (Rb), the barrier energy measured from Re to Rb (Eb), and the dissociation energy (De) measured from Re to the asymptotic monocations.

He₂²⁺ has a classical two center spin-coupled covalent configuration that reproduces ~80% of the overall Eb value and has a binding energy curve that shows all the features of the full ground state curve. However, the He₂²⁺ binding energy is determined quantitatively by covalent-ionic resonance. O₂²⁺, isovalent with N₂, has a multi-structural description of the covalent and ionic bonding modes, including a triple bond component. Although the covalent group by itself shows all the qualitative bonding and barrier features of the full ground state curve, quantitative accuracy requires mixing between the covalent and ionic bonding type sets. NF²⁺, isovalent with CO, has predominantly double bond covalent-type bonding of the σ²π² form. The contribution of ionic N⁰F²⁺ to the ground state wave function is similar to that of the lower asymptotic energy N²⁺F⁰ ionic bonding because the former has both a triple bond
configuration component and an attractive N⁰(σ²) - F²⁺(σ⁰) interaction component. Again, as found for O₂²⁺, although the covalent structure set alone mimics the full ground state energy dissociation curve, covalent-ionic resonance is quantitatively important. Therefore, as generally described by Pauling, the diatomic dications having formal single (He₂²⁺), double (NF²⁺) or triple (O₂²⁺) bonds all seem to be well described as ordinary chemical bonds plus electrostatic repulsion (1,4).

6. Acknowledgement

This work was supported by a grant from the Israel Science Foundation (484/95).

7. References
(1)
L. Pauling, J. Chem. Phys., 1, 56 (1933).
(2)
M. Guilhaus, A. G. Brenton, J. H. Beynon, M. Rabrenovic and P. von R. Schleyer, J. Phys. B, 17, L605 (1984).
(3)
A. Belkacen, E. P. Kanter, R. E. Mitchell, Z. Vager and B. J. Zabransky, Phys. Rev. Lett., 63,2555(1989).
(4)
L. Pauling, "The Nature of the Chemical Bond", 3rd Ed., Cornell University Press, Ithaca, New York (1960).
(5)
A. C. Hurley and V. W. Maslen, J. Chem. Phys., 34, 1919 (1961).
(6)
A. C. Hurley, J. Mol. Spectrosc., 9, 18 (1962).
(7)
C. Nicolaides, Chem. Phys. Lett., 161, 547 (1989).
(8) (9)
P. Jonathan, R. K. Boyd, A. G. Brenton and J. H. Beynon, Chem. Phys., 110, 239 (1986). J. Senekowitsch and S. ONeil, J. Chem. Phys., 95, 1847 (1991).
(lO)
J. Senekowitsch, S. ONeil and W. Meyer, Theor. Chim. Acta, 84, 85 (1992).
(11)
C. Brechignac, M. Broyer, Ph. Cahuzac, G. Delacretaz, P. Labastie and L. Woste, Chem. Phys. Lett., 118, 174 (1985).
(12)
G. Durand, J. P. Daudey and J. P. Malrieu, J. Phys. (Pads), 47, 1335 (1986).
(13)
G. Durand, J. P. Daudey and J. P. Malrieu, Lecture Notes in Physics, 269, 127 (1987).
(14)
P. M. W. Gill and L. Radom, Chem. Phys. Lett., 147, 213 (1988).
(15)
L. Radom, P. M. W. Gill and M. W. Wong, Int. J. Quant. Chem. Symp., 22, 567 (1988).
(16)
M. Kobuszewski, J. S. Wright and R. J. Buenker, J. Chem. Phys., 102, 7519 (1995).
(17)
M. Kolbuszewski and J.-P. Gu, J. Chem. Phys., 103, 7649 (1995).
440
(18)
N. Levasseur, Ph. Mime, P. Archirel and B. Levy, Chem. Phys., 153, 387 (1991).
(19)
H. Basch, P. Aped and S. Hoz, Chem. Phys. Lett., 255, 336 (1996).
(20)
G. Sini, P. Maitre, P.C. Hiberty and S. S. Shaik, J. Mol. Stmct., 229, 163 (1991).
(21)
S. ShailL P. Maitre, G. Sini and P. C. Hiberty, J. Am. Chem. Soc., 114, 7861 (1992).
(22)
H. Basch, P. Aped and S. Hoz., Mol. Phys., 89, 331 (1996).
(23)
D. Lauvergnat, P. C. Hiberty, D. Danovich and S. Shaik, J. Chem. Phys., 100, 5715 (1996).
(24)
G. Frenking, W. Koch, D. Cremer, J. Gauss and J. F. Liebmann, J. Phys. Chem., 93, 3397 (1989).
(25)
N. Levasseur and P. Millie, J. Chem. Phys., 92, 2974 (1990).
(26)
H. Basch, J. L. Wolk and S. Hoz, submitted for publication.
(27)
J. H. van Lenthe and G. G. Balint-Kurti, Chem. Phys. Lett., 76, 138 (1980).
(2S)
J. H. van Lenthe and G. G. Balint-Kurti, J. Chem. Phys., 78, 5699 (1983).
(29)
J. Verbeek, "Nonorthogonal Orbitals in Ab lnitio Many Electron Wavefunetions", Ph. D. Thesis, Utrecht University, Utrecht, Holland (1990).
(30)
J. Verbeek, J. Langenberg, C. P. Byrman and J. H. van Lenthe, "TURTLE- An Ab Initio VB/VBSCF/VBCI Program", Theoretical Chemistry Group, Debye Institute, Utrecht University (1995).
(31) P . C. Hiberty, S. Humbel, C. P. Byrman and J. H. van Lenthe, J. Chem. Phys., 101, 5969 (1994). (32)
H. Yagisawa, H. Sato and T. Watanabe, Phys. Rev., A,16, 1352 (1977).
(33)
P. C. Hiberty, J. P. Flament and E. Noizet, Chem. Phys. Lett., 189, 259 (1992).
(34)
D. Lauvegnat, P. Maitre, P. C. Hiberty and F. Volatron, J. Phys. Chem., 100, 6463 (1996).
(35)
H. Partridge, J. Chem. Phys., 90, 1043 (1989).
(36)
M. Wolfsberg and L. Helmholtz, J. Chem. Phys., 20, 837 (1952).
(37)
N. H. F. Beebe, E. W. Thulstrup and A. Andersen, J. Chem. Phys., 64, 2080 (1976).
(38)
J. H. Agee, J. B. Wilcox, L. E. Abbey and T. F. Moran, Chem. Phys., 61, 171 (1981).
(39)
J.M. Curtis and R. K. Boyd, J. Chem. Phys., 81, 2991 (1984).
(40)
K. Tohji, D. M. Hanson and B. X. Yang, J. Chem. Phys., 85, 7492 (1986).
441 (41)
H. Sambe and D. E. Ramaker, Chem. Phys., 104, 331 (1986).
(42)
B. X. Yang, D. M. Hanson and K. Tohji, J. Chem. Phys., 89, 1215 (1988).
(43)
J. H. D. Eland, S. D. Price, J. C. Cheney, P. Lablanquie, I. Nenner and P. G. Fournier, Phil. Trans. R. Soe. Lond. A 324, 247 (1988).
(44)
A. T. Balaban, G. R. de Mare and R. A. Poirier, J. Mol. Struct., 183, 103 (1989).
(45)
M. W. Wong, R. H. Nobes, R. H. Bouma and W. J. Radom, J. Chem. Phys., 91, 2971 (1989).
(46)
H. Hamdan and A. G. Brenton, Chem. Phys. Lett., 164, 413 (1989).
(47)
M. Larsson, P. Baltzer, S. Svensson, B. Wannberg, N. Martensson, A. N. de Brito, C. Correia, M. P. Keane, M. Carlsson-Gothe and L. Karlsson, J. Phys. B: At. Mol. Opt. Phys., 23, 1175 (1990).
(48)
H. Yang, D. M. Hanson, F. V. Trentini and J. L. Whitten, Chem. Phys., 147, 115 (1990).
(49)
N. Saito and I. H. Suzuki, J. Chem. Phys., 93, 4073 (1990).
(5o)
W. J. van der Zande, Chem. Phys., 157, 287 (1991).
(51)
R. H. Nobes, D. Moncrieff, M. W. Wong, L. Radom, P. M. W. Gill and J. Pople, Chem. Phys. Lett., 182, 216 (1991).
(52)
L. G. M. Petterson and M. Larsson, J. Chem. Phys., 94, 818 (1991).
(53)
H. Basch. J. Mol. Struct., 234, 185 (1991).
(54)
H. Basch, S. Hoz, M. Goldberg and L. Gamss, Isr. J. Chem., 31, 335 (1991).
(55)
R. B. Murphy and R. P. Messmer, J. Chem. Phys., 97, 4170 (1992).
(56)
J. Fournier, P. G. Fournier, M. L. Langford, M. Mousselmal, J. M. Robbe and G. Gandara, J. Chem. Phys., 96, 3594 (1992).
(57)
H. Basch, S. Hoz and M. Goldberg, Isr. J. Chem., 33, 403 (1993).
(58)
R. Pauncz, "Spin Eigenfunctions, Construction and Use", Plenum Press, New York (1979).
(59)
S. Huzinaga, J. Chem. Phys., 42, 1293 (1965).
(60)
B. H. Chirgwin and C. A. Coulson, Proc. Roy. Soc. Lond., Ser. A, 201, 196 (1950).
(61)
S. A. Rogers, P. J. Miller, S. R. Leone and B. Brehm, Chem. Phys. Lett., 166, 137 (1990).
(62)
J. Senekowitsch, S. V. O'Neil, H.-J. Werner and P. J. Knowles, J. Phys. B: At. Mol. Opt. Phys., 24, 1529 (1991).
Table 1. VB occupancies, configurations and structures for O2^2+.

Config.   Orbital occupancies (a)      Struct.    Charge
number    za  zb  xa  xb  ya  yb       number     distribution
  1       1   1   1   1   1   1        1-5        Oa+ : Ob+
  2       1   1   0   2   2   0        6
  3       1   1   2   0   0   2        7
  4       2   0   1   1   0   2        8
  5       2   0   0   2   1   1        9
  6       0   2   1   1   2   0        10
  7       0   2   2   0   1   1        11
  8       1   1   0   2   1   1        12,13      Oa2+ Ob0
  9       1   1   1   1   0   2        14,15
 10       0   2   1   1   1   1        16,17
 11       2   0   0   2   0   2        18
 12       0   2   0   2   2   0        19
 13       0   2   2   0   0   2        20
 14       1   1   2   0   1   1        21,22      Oa0 Ob2+
 15       1   1   1   1   2   0        23,24
 16       2   0   1   1   1   1        25,26
 17       2   0   0   2   2   0        27
 18       2   0   2   0   0   2        28
 19       0   2   2   0   2   0        29
 20       1   1   0   2   0   2        30         Oa3+ Ob-
 21       0   2   0   2   1   1        31
 22       0   2   1   1   0   2        32
 23       1   1   2   0   2   0        33         Oa- Ob3+
 24       2   0   2   0   1   1        34
 25       2   0   1   1   2   0        35

a. a = Oa; b = Ob.
Table 2. Calculated internuclear distances and energies of extremum points of ground state energy curves.

Dication   Level                  Re(a) (Å)   Rb(b) (Å)   Eb(c) (kcal/mole)   D(d) (kcal/mole)
O2^2+      O+:O+ (e)              1.039       1.466        77.5               -154.3
           I                      1.061       1.555        71.4               -123.7
           II                     1.064       1.589        70.6               -110.9
           III                    1.064       1.592        71.2               -110.1
           IV                     1.066       1.600        71.9               -107.5
           Exp. (f) / Prev. (g)   1.073       ~1.64        ~80                ~-90
NF^2+      N+:F+ (h)              1.072       1.662        94.8                -79.9
           I                      1.098       1.917       105.3                -42.3
           II                     1.101       2.054       109.2                -27.8
           III                    1.102       2.055       109.6                -27.4
           IV                     1.104       2.061       110.3                -25.0
           Ref. 45                1.102       1.97        109.2                -22.7
           Ref. 62                1.103       2.085       112.5                -22.4

a. Re = Equilibrium bond length.
b. Rb = Barrier maximum internuclear distance.
c. Barrier height energy measured from Re to Rb.
d. Dissociation energy measured from Re to asymptotes.
e. Covalent type configurations 1 to 7 in Table 1.
f. See ref. 49.
g. See text.
h. Covalent type configurations 1 to 6 in Table 3.
Table 3. VB occupancies, configurations and structures for NF^2+.

Config.   Orbital occupancies (a)      Struct.    Charge
number    za  zb  xa  xb  ya  yb       number     distribution
  1       1   1   1   1   0   2        1,2        N+ : F+
  2       1   1   0   2   1   1        3,4
  3       0   2   1   1   1   1        5,6
  4       2   0   0   2   0   2        7
  5       0   2   2   0   0   2        8
  6       0   2   0   2   2   0        9
  7       1   1   0   2   0   2        10         N2+ F0
  8       0   2   1   1   0   2        11
  9       0   2   0   2   1   1        12
 10       1   1   1   1   1   1        13-17      N0 F2+
 11       1   1   2   0   0   2        18
 12       1   1   0   2   2   0        19
 13       2   0   0   2   1   1        20
 14       2   0   1   1   0   2        21
 15       0   2   2   0   1   1        22
 16       0   2   1   1   2   0        23
 17       0   2   0   2   0   2        24         N3+ F-
 18       1   1   1   1   2   0        25,26      N- F3+
 19       1   1   2   0   1   1        27,28
 20       2   0   1   1   1   1        29,30
 21       0   2   2   0   2   0        31
 22       2   0   2   0   0   2        32
 23       2   0   0   2   2   0        33

a. a = N; b = F.
Captions to Figures

Fig. 1 He2^2+ dissociation energy curves for the ground state (level II) and the covalent configuration alone (He+:He+).

Fig. 2 O2^2+ dissociation energy curves for the ground state (levels I and II), and individual covalent (O+:O+) and ionic (O2+ O0) configuration sets.

Fig. 3 NF^2+ dissociation energy curves for the ground state (levels I and II), and individual covalent (N+:F+) and ionic (N2+ F0 and N0 F2+) configuration sets.
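[Figure 1: energy curves labelled He+:He+ and He2^2+; horizontal axis He-He distance (Å), 0.50 to 2.00; vertical axis energy, -3.55 to -3.75.]

[Figure 2: energy curves for O2^2+, including the O+:O+ and level II curves; horizontal axis O-O distance (Å), 1.0 to 4.0; vertical axis energy, -148.0 to -148.5.]

[Figure 3: energy curves for NF^2+, including the level II curve; horizontal axis N-F distance (Å), 1.0 to 4.0; vertical axis energy, -152.0 to -152.5.]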
Z.B. Maksić and W.J. Orville-Thomas (Editors)
Pauling's Legacy: Modern Modelling of the Chemical Bond
Theoretical and Computational Chemistry, Vol. 6
© 1999 Elsevier Science B.V. All rights reserved.

One-electron and three-electron chemical bonding, and increased-valence structures

Richard D. Harcourt
School of Chemistry, University of Melbourne, Parkville, Victoria 3052, Australia

Valence bond and molecular orbital approaches to descriptions of one-electron bonds and Pauling three-electron bonds are reviewed, and attention is given to the incorporation of Pauling three-electron bonds into the valence bond structures for molecular systems that involve four-electron three-centre, five-electron three-centre and six-electron four-centre bonding units. Many of these valence bond structures are examples of increased-valence structures. Examples are provided for a few of the large number of phenomena that involve these types of bonds and valence bond structures.

1. INTRODUCTION
Pauling's contributions to valence bond (VB) theory are inscribed indelibly on our collective consciousness via his development and use of concepts such as orbital hybridization, resonance, electronegativity, and the electroneutrality principle. Another concept, which Pauling introduced [1,2] - the so-called diatomic (Pauling) "three-electron bond" - has received less attention. With a minimal basis set, the three-electron bond involves three electrons that are distributed amongst two overlapping atomic orbitals (AOs), which are located on two atomic centres. Alternative designations for this type of bond include three-electron two-centre bond, three-electron two-orbital bond and three-electron hemibond [3]. It is fair to say that Pauling did not appear to realize how the three-electron bond is able to play a key role in the provision of VB descriptions of the electronic structures of electron-rich molecules. However a proper understanding of the nature and representation of this type of bond leads to the reformulation of the conceptual basis for a large component of qualitative VB theory [4-7]. This reformulation is achieved via the construction and use of increased-valence structures, which arise when the three-electron bond is incorporated into the VB structures of diamagnetic as well as paramagnetic molecules. Although numerous references to some of the recent publications on the three-electron bond have been provided recently [8], no consideration was given in these references to the very extensive literature that exists with regard to the formulation and use of increased-valence structures.
As a tribute to Pauling's contributions, I shall restate and summarize some of the implications for bonding theory that arise when the three-electron bond is incorporated as a mainstream component for VB descriptions of the electronic structures of electron-rich molecules. Attention will be focussed on increased-valence structures for molecular systems that involve four-electron three-centre and six-electron four-centre bonding units. Initially, however, consideration will be given to the one-electron bond; Pauling also gave some attention to its theory and to examples of systems that involve this type of bond in their VB structures. As indicated in ref. [8(a)], one-electron bonds and three-electron bonds are experimentally abundant and well-characterized for odd-electron systems.
2. THE ONE-ELECTRON BOND

The ground-state for the simplest molecular system, H2+, involves a one-electron bond. Pauling used the LCAO (linear combination of AOs) molecular orbital (MO), $\psi_+ = (a + b)/(2 + 2S_{ab})^{1/2}$, to accommodate the bonding electron, for which a and b are nuclear-centred overlapping AOs that are oriented so that the AO overlap integral $S_{ab} = \langle a|b\rangle$ is greater than zero. Pauling calculated approximate values for the binding energy and bond length for the ground-state of this system [9]. Pauling also indicated that the LCAO wavefunction for the ground-state was equivalent to the wavefunction for resonance between the VB structures Ia and Ib,

    (H · H)(+)   ≡   (H·  H(+))   ↔   (H(+)  ·H)
        I               Ia               Ib

with wavefunctions $\Psi_{Ia} = a$ and $\Psi_{Ib} = b$. The equivalent VB representations are therefore those of I ≡ Ia ↔ Ib. When constructive interference of overlapping AOs occurs, electron charge density accumulates in the internuclear region. For both H2+ and H2, Ruedenberg has demonstrated that this constructive interference process leads to a "crucial reduction" in the (molecular) kinetic energy [10], which occurs whether or not the virial theorem is obeyed [10] (see also ref. [11]). This lowering of the kinetic energy is calculated to be responsible for the bonding stabilization of the one-electron bond of H2+ [11,12]. The same result was also obtained for the electron-pair bond of H2 [10]. The bonding for heteronuclear systems with one-electron bonds has also been studied by Ruedenberg and Feinberg [13]. We may write $\psi_{ab} = (a + kb)/(1 + 2kS_{ab} + k^2)^{1/2}$ as the wavefunction for a heteronuclear one-electron bond, with k > 0 when $S_{ab}$ > 0, to give the VB structure II

    (A · B)(+)   ≡   (A·  B(+))   ↔   (A(+)  ·B)
        II              IIa              IIb

which is equivalent to (the lower-energy) resonance between the Lewis structures IIa and IIb. (The formal charges indicated are those that obtain when A and B are neutral atoms.) The coefficient k is a variationally-determined polarity parameter. Ruedenberg and Feinberg [13] have demonstrated that a stable one-electron bond exists when the A and B nuclear charges differ by less than about 20%, and that the basic origin of covalent bonding is also associated with a reduction in the kinetic energy component of the constructive interference energy.
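The minimal-basis LCAO treatment of H2+ described above can be reproduced in a few lines. The sketch below is illustrative only and is not Pauling's original calculation: it assumes hydrogen 1s AOs with a fixed exponent of 1, atomic units throughout, and the standard closed-form two-centre integrals for that basis. The function names are hypothetical.

```python
import math

def overlap(R):
    """Overlap S_ab of two hydrogen 1s AOs (exponent 1) separated by R bohr."""
    return math.exp(-R) * (1.0 + R + R * R / 3.0)

def lcao_energy(R):
    """Energy (hartree) of psi_+ = (a + b)/(2 + 2 S_ab)^(1/2) for H2+ at separation R."""
    S = overlap(R)
    coulomb = (1.0 - (1.0 + R) * math.exp(-2.0 * R)) / R   # <a| 1/r_b |a>
    resonance = (1.0 + R) * math.exp(-R)                   # <a| 1/r_a |b>
    # E_1s = -1/2 hartree; 1/R is the nuclear repulsion.
    return -0.5 + 1.0 / R - (coulomb + resonance) / (1.0 + S)

def scan_minimum(r_min=1.0, r_max=6.0, step=0.001):
    """Crude grid search for the minimum of the LCAO energy curve."""
    best_R, best_E = r_min, lcao_energy(r_min)
    n_steps = int(round((r_max - r_min) / step))
    for i in range(1, n_steps + 1):
        R = r_min + i * step
        E = lcao_energy(R)
        if E < best_E:
            best_R, best_E = R, E
    return best_R, best_E

if __name__ == "__main__":
    R_e, E_e = scan_minimum()
    print(f"R_e = {R_e:.3f} bohr, E = {E_e:.5f} hartree, "
          f"binding energy = {(-0.5 - E_e) * 27.2114:.2f} eV")
```

With these assumptions the curve has its minimum near 2.5 bohr, with a binding energy of roughly 1.8 eV relative to H + H+. Optimizing the orbital exponent, which is not done here, would improve both values.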
Figure 1. Electron conduction in p- and n-type semiconductors, and alkali metals. (Cathode (-) on the left, anode (+) on the right.)
It may be noted that p-type semiconductors involve the formation of one-electron bonds in their VB structures (cf. Figure 1). Consider silicon doped with gallium as an example of this class of semiconductor. Prior to doping, each silicon atom is involved in the formation of four electron-pair bonds. When a gallium atom replaces a silicon atom, an Si-Si electron-pair bond is broken, and replaced by Ga  ·Si, for which the vacant gallium orbital (a) overlaps with the singly-occupied silicon orbital (b). A one-electron Ga-Si bond is thereby formed, as in Ga · Si, for which the electron occupies the bonding MO $\sigma_{GaSi} = a + kb$. Under the influence of an electric potential, electron conduction may proceed via the transfer of an electron from a doubly-occupied Si-Si bonding MO $\sigma_{SiSi}$ into the singly-occupied bonding MO $\sigma_{GaSi}$, as indicated in 1 → 2 of Figure 1. This process is followed by the transfer of an electron from a doubly-occupied $\sigma_{GaSi}$ bonding MO into the singly-occupied Si-Si bonding MO. (For simplicity in Figures 1 and 6, bonding MO formulations for electron-pair bond wavefunctions are used.)
3. THE ONE-ELECTRON BOND AND NON-PAIRED SPATIAL ORBITAL STRUCTURES
Valence bond structures with one-electron bonds for molecules with two or more electrons may also be constructed. This was done on several occasions during the early 1920's, prior to the development of modern quantum chemistry. Thus Kermack and Robinson [14] used the VB structure 1 of Figure 2, which is equivalent to VB structure 2 of this Figure, to represent the electronic structure of C6H6. Using the σ-π terminology of today, this VB structure has six one-electron C-C π bonds, as well as C-C and C-H electron-pair σ bonds. Kermack and Robinson also devised VB structures with one-electron bonds for a variety of other isolated and reacting molecular species. For PF5, axial one-electron bonds were used by Prideaux [15], as in VB structure 3 of Figure 2. However these contributions appear to have been overlooked, and not incorporated into mainstream VB writings.

Figure 2. Kermack-Robinson [14], Linnett [16] and Prideaux [15] VB structures for C6H6 and PF5.

Much later, VB structures with one-electron bonds were designated as non-paired spatial orbital (npso) structures by Linnett [16], and used extensively by Linnett and his coworkers during the 1960's and early 1970's [16-19]. (The npso structures for relevant molecular systems - for example C6H6 - are the same as those that had been proposed by Kermack and Robinson.) Linnett and his coworkers then provided npso wavefunction formulations for numerous systems. For two-electron three-centre, three-electron three-centre and four-electron three-centre bonding units, the (Kekulé-type) Lewis structures # are those of III-V, and the npso structures are those of VI-VIII.
    Y    A-B   ↔   Y-A    B          IIIa ↔ IIIb
    Y·   A-B   ↔   Y-A   ·B          IVa ↔ IVb
    Y:   A-B   ↔   Y-A   :B          Va ↔ Vb

    Y · A · B                         VI
    Y · A · B·   ↔   ·Y · A · B       VIIa ↔ VIIb
    ·Y · A · B·                       VIII
Figure 3. Lewis-Kekulé and npso structures for C3H5+, C3H5 and C3H5-.

The (primarily) minimal basis set calculations of Linnett and co-workers [19] demonstrate clearly that the npso structures with one-electron bonds generate a lower energy than does resonance between the Kekulé-type Lewis structures, regardless of the manner in which the wavefunctions for the electron-pair bonds are constructed. The reason for the greater stability of the npso structures is associated primarily with the presence of better electron charge correlation in the npso structures, i.e. the electrons are better separated spatially. The π electrons of the allyl cation, radical and anion provide familiar examples of two-electron three-centre, three-electron three-centre and four-electron three-centre bonding units respectively. The associated Kekulé-type Lewis structures and npso structures for these species are displayed in Figure 3.

# In contrast to what occurs in most of the VB structures of Figures 1-13, atomic formal charges are not indicated in VB structures III-XXXIX. In Figures 1-13, the formal charges are derived on the assumption that bonding electrons are shared equally by pairs of adjacent atoms, cf. [4].
4. A THEOREM

The following identity [4-6,20,21], which indicates the invariance of a Slater determinantal wavefunction to a unitary transformation of a pair of occupied $M_S = +1/2$ spin or $M_S = -1/2$ spin orbitals, will be used on numerous occasions in subsequent sections of this Chapter. If $\phi_1$ and $\phi_2$ are two orbitals (atomic, molecular, overlapping or orthogonal) from which the linear combinations $\phi = \phi_1 + k\phi_2$ and $\phi^* = k^*\phi_1 - \phi_2$ are constructed, then the Slater determinantal wavefunction of eq.(1)

$$|\phi^{\alpha}\phi^{*\alpha}| = \{\phi(1)\alpha(1)\phi^*(2)\alpha(2) - \phi^*(1)\alpha(1)\phi(2)\alpha(2)\}/\sqrt{2} \tag{1}$$

is equivalent to eq.(2)

$$-(1 + kk^*)|\phi_1^{\alpha}\phi_2^{\alpha}| = -(1 + kk^*)\{\phi_1(1)\alpha(1)\phi_2(2)\alpha(2) - \phi_2(1)\alpha(1)\phi_1(2)\alpha(2)\}/\sqrt{2} \tag{2}$$

in which $\alpha$ is the electron spin wavefunction for $m_s = +1/2$. This identity provides an example of a unitary transformation of a pair of occupied orbitals. The same type of identity of course also obtains when each electron has a $\beta$ (i.e. $m_s = -1/2$) spin wavefunction. Usually the $k^*$ parameter is chosen so that $\phi^*$ is orthogonal to $\phi$.

5. THE THREE-ELECTRON BOND, OR THREE-ELECTRON HALF-BOND
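Because the identity of eqs.(1) and (2) is used repeatedly in what follows, a quick numerical check may be helpful. The sketch below is an illustration under stated assumptions, not part of the original derivation: the two basis orbitals are arbitrary one-dimensional Gaussians (hypothetical stand-ins for any pair of overlapping orbitals), and only the spatial part of the same-spin determinant is evaluated, since the common spin factor cancels.

```python
import math

# Hypothetical, non-orthogonal basis orbitals (1-D Gaussians chosen for illustration).
def phi1(x):
    return math.exp(-(x + 0.7) ** 2)

def phi2(x):
    return math.exp(-0.8 * (x - 0.7) ** 2)

def pair_det(f, g, x1, x2):
    """Spatial part of the same-spin two-electron determinant |f g| at (x1, x2)."""
    return (f(x1) * g(x2) - g(x1) * f(x2)) / math.sqrt(2.0)

def identity_residual(k, k_star, x1, x2):
    """Left side of eq.(1) minus right side of eq.(2); should vanish for any k, k*."""
    bonding = lambda x: phi1(x) + k * phi2(x)            # phi  = phi1 + k phi2
    antibonding = lambda x: k_star * phi1(x) - phi2(x)   # phi* = k* phi1 - phi2
    lhs = pair_det(bonding, antibonding, x1, x2)
    rhs = -(1.0 + k * k_star) * pair_det(phi1, phi2, x1, x2)
    return lhs - rhs

if __name__ == "__main__":
    for k, k_star in [(0.5, 0.3), (1.0, 1.0), (2.0, -0.4)]:
        print(k, k_star, identity_residual(k, k_star, 0.0, 1.0))
```

The residual vanishes to rounding error for every choice of k and k*, including values for which the two combinations are not orthogonal. This reflects the fact that the identity depends only on the 2 x 2 determinant of the mixing coefficients, which equals -(1 + kk*).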
The concept of the three-electron bond was introduced by Pauling in 1931 [1], to provide VB structures for paramagnetic systems such as He2+, NO, O2 and NO2. Pauling's VB structures for these species are displayed in Figure 4.

Figure 4. Pauling and Linnett three-electron bond VB structures for He2+, NO, O2 and NO2. (a) Electron spins indicated for the three-electron bond, when $M_S = +1/2$.

With two overlapping AOs (a and b) to accommodate the three electrons, the generalized three-electron bond structure IX,

    A ··· B   ≡   A:  ·B   ↔   A·  :B
      IX            Xa           Xb

is equivalent to resonance between the Lewis VB structures Xa and Xb, with AO configurations $(a)^2(b)^1 = |a^{\alpha}a^{\beta}b^{\alpha}|$ and $(a)^1(b)^2 = |a^{\alpha}b^{\beta}b^{\alpha}|$, where it is assumed that the odd electron has $m_s = +1/2$ spin. We may therefore write

$$\Psi_{IX} = \Psi_{Xa} + k\Psi_{Xb} = |a^{\alpha}a^{\beta}b^{\alpha}| + k|a^{\alpha}b^{\beta}b^{\alpha}| = |a^{\alpha}(a + kb)^{\beta}b^{\alpha}| = |a^{\alpha}\psi_{ab}^{\beta}b^{\alpha}| \tag{3}$$

in which $\psi_{ab} = a + kb$ is an A-B bonding MO. A unitary transformation of the $a^{\alpha}$ and $b^{\alpha}$ spin orbitals that arise in $|a^{\alpha}\psi_{ab}^{\beta}b^{\alpha}|$ generates the identity of eq.(4),

$$\Psi_{IX} = |a^{\alpha}\psi_{ab}^{\beta}b^{\alpha}| = K|\psi_{ab}^{\alpha}\psi_{ab}^{\beta}\psi_{ab}^{*\alpha}| \tag{4}$$

in which $\psi^*_{ab} = k^*a - b$ is the orthogonal A-B antibonding MO constructed from the a and b AOs, and $K = -1/(1 + kk^*)$. The identity of eq.(2), which was developed by Linnett, shows that the "three-electron bond" is equivalent to two bonding electrons + one antibonding electron, which generates a maximum bond order of 0.5 [4-6,16,21,22]. On the basis of the identity of eq.(2), Linnett recommended that the VB structure for the three-electron bond be written as either XI or XII

    ·A · B·      Ȧ · Ḃ       A(x) o B(x)      xA o Bx
      XI          XII           XIII            XIV

rather than as IX, thereby showing clearly that this bond involves effectively only one bonding electron. (Because of this result, Linnett [16,21] also indicated that the designation "three-electron bond" in the Pauling sense is a misnomer; however this designation has continued to be used.) When electron spins are indicated by crosses and circles (x = $m_s$ = +1/2, o = $m_s$ = -1/2), the Linnett representations for the three-electron bond are those of structures XIII or XIV. (Linnett's VB structures for He2+, NO, O2 and NO2 are displayed in Figure 4.) Alternative designations for the (Pauling) three-electron bond are three-electron two-centre bond/two-orbital bond/hemi-bond/half-bond. Some workers [22] use either of the less-satisfactory VB structures XV or XVI to represent this type of bond.
A'
B
A- " . B X VI
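The equivalence between the resonance form of eq.(3) and the MO form of eq.(4) can be checked numerically by evaluating unnormalised three-electron Slater determinants at arbitrary space-spin coordinates. The sketch below is an illustration only: the AOs a and b are hypothetical one-dimensional Gaussians, the helper names are not from the original text, and the spin-orbital ordering in the MO determinant is taken as (ψ_ab α, ψ_ab β, ψ*_ab α), which is the ordering for which K = -1/(1 + kk*).

```python
import math

ALPHA, BETA = +1, -1

# Hypothetical overlapping AOs (1-D Gaussians used purely for illustration).
def a(x):
    return math.exp(-(x + 0.6) ** 2)

def b(x):
    return math.exp(-1.3 * (x - 0.6) ** 2)

def spin_orbital(f, spin):
    """Spin orbital: spatial function f times a spin function (ALPHA or BETA)."""
    return lambda coord: f(coord[0]) if coord[1] == spin else 0.0

def det3(m):
    """Determinant of a 3 x 3 matrix given as nested lists."""
    return (m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
            - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
            + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]))

def slater(orbitals, coords):
    """Unnormalised Slater determinant |o1 o2 o3| at three (x, spin) coordinates."""
    return det3([[orb(c) for c in coords] for orb in orbitals])

def three_electron_bond_forms(k, k_star, coords):
    """Return the resonance form, the mixed form of eq.(3) and the MO form of eq.(4)."""
    psi = lambda x: a(x) + k * b(x)            # bonding MO, a + kb
    psi_star = lambda x: k_star * a(x) - b(x)  # antibonding MO, k*a - b
    so = spin_orbital
    resonance = (slater([so(a, ALPHA), so(a, BETA), so(b, ALPHA)], coords)
                 + k * slater([so(a, ALPHA), so(b, BETA), so(b, ALPHA)], coords))
    mixed = slater([so(a, ALPHA), so(psi, BETA), so(b, ALPHA)], coords)
    K = -1.0 / (1.0 + k * k_star)
    mo = K * slater([so(psi, ALPHA), so(psi, BETA), so(psi_star, ALPHA)], coords)
    return resonance, mixed, mo

if __name__ == "__main__":
    coords = [(0.1, ALPHA), (-0.4, BETA), (0.9, ALPHA)]
    print(three_electron_bond_forms(0.7, 0.4, coords))
```

All three values agree to rounding error. The MO form contains two electrons in ψ_ab and one in ψ*_ab, so the formal bond order is (2 - 1)/2 = 0.5, as stated in the text.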
We now list some of the types of paramagnetic molecular systems for which three-electron bonds manifest themselves in their VB structures.
5.1 Paramagnetic electron-rich molecules and molecular ions that involve atoms of main-group elements

These molecular species provide the most familiar examples of systems that involve three-electron bonds. Some examples are displayed in Figure 4. Numerous systems with (N ··· N)(+) and (S ··· S)(+) three-electron σ bonds have been studied recently [23-25].

5.2 Hypoligated transition metal complexes, such as high-spin (S = 2) [Fe(H2O)6]2+

In an octahedral field, high-spin Fe2+ (3d)6 (S = 2) involves a doubly-occupied t2g AO; the four remaining 3d AOs are singly-occupied with parallel spins for their electrons. In Figure 5, VB structure 1 for the hypoligated transition metal complex, [Fe(H2O)6]2+, is one of the 15 Kekulé-type Lewis structures that may then be constructed for this complex. The high-spin Fe2+ utilizes 3dx2-y2, 3dz2, 4s and the three 4p AOs to form six octahedral d2sp3 hybrid AOs. Four of these hybrid AOs are used to coordinate four H2O ligands via Fe-OH2 electron-pair bond formation. The two remaining d2sp3 AOs are singly-occupied in VB structure 1. Each of the latter AOs overlaps with an oxygen lone-pair AO (n) on an H2O ligand, to form an (Fe ··· O) ≡ (Fe·  :O) ↔ (Fe:  ·O) three-electron bond, with the odd electron occupying an antibonding Fe-O MO ($\sigma^*_{FeO} = k^*d^2sp^3 - n$). The resulting three-electron bond VB structures are of type 2 of Figure 5 [26]. These structures may be derived from Lewis structures of type 1 via the delocalization of an oxygen lone-pair electron from each of two ligands into two bonding Fe-O MOs ($\sigma_{FeO} = d^2sp^3 + kn$), as is indicated in the latter structure. Resonance between the 15 VB structures of type 2 provides a VB equivalence for the delocalized MO description of this high-spin hypoligated complex.

Figure 5. Valence bond structures for high-spin [Fe(L)6]2+, with two spin-paired t2g electrons of Fe2+ not indicated. L = H2O.

In the Pauling theory of hypoligated complexes, outer sp3d2 (d = 4dx2-y2 and 4dz2) hybrid AOs are used to form six Fe-O electron-pair σ-bonds [27]. However the $(\sigma^*_{FeO'})^1(\sigma^*_{FeO''})^1 \rightarrow (4d)^2$ promotion energy is considerable, and therefore the Pauling description should represent an excited configuration for the complex. Formosinho and Arnaut [28] have also used three-electron bonds in their descriptions of the electronic structures of transition metal complexes in order to calculate bond-orders of transition states when these systems are involved in electron-transfer reactions.
5.3 F+-type colour centres

In alkaline-earth oxides CaO and MgO, F-type defect centres (i.e. anion vacancies with trapped electrons) are responsible for many of the luminescence bands which have been observed [29]. The defects produced by elastic collisions act as traps for free electrons [29]. The trapping may be described as the donation of one or more electrons to a Mg2+ site which is adjacent to the anion vacancy. The donation of one electron into the magnesium 3s AO, which overlaps with the doubly-occupied 2pσ AO of an adjacent O2-, enables an (Mg ··· O)- three-electron bond to be formed via the (Mg·(+)  :O(2-)) ↔ (Mg:  ·O(-)) resonance. The odd electron occupies the antibonding MO $\sigma^*_{MgO} = k^*3s_{Mg} - 2p\sigma_O$, and may be delocalized throughout the crystal via the transfer of it from one antibonding $\sigma^*_{MgO}$ MO into another. (The two electrons of each Mg2+O2- component may be accommodated in the bonding MO $\sigma_{MgO} = 3s_{Mg} + k2p\sigma_O$.) This description of the bonding and electron delocalization may be compared with that for the (H2O)-solvated electron [6], for which the odd electron occupies an antibonding $\sigma^*_{OH}$ MO of H2O-, to give VB structures of the type (HO ··· H)-. The odd electron may be delocalized into vacant antibonding $\sigma^*_{OH}$ MOs of adjacent H2O molecules [6,30].

5.4 n-type semiconductors

If silicon is doped with arsenic, an n-type semiconductor is obtained. A Si-Si electron-pair bond is replaced by an As-Si three-electron bond (As ··· Si), which is equivalent to resonance between the Lewis structures (As:  ·Si) and (As·  :Si). The odd electron of (As ··· Si) occupies the antibonding $\sigma^*_{AsSi}$ MO. Under the influence of an electrical potential, electron conduction may be initiated via the transfer of the odd electron from the $\sigma^*_{AsSi}$ MO into an overlapping antibonding $\sigma^*_{SiSi}$ MO, as is indicated in the VB structures 3 and 4 of Figure 1 [30]. The odd electron is then transferred into a second $\sigma^*_{SiSi}$ MO, to establish the conduction process.

5.5 Conduction in alkali metals in the solid state #

Pauling has described the process of electron conduction in solid state alkali metals in terms of electron transfer from one metallic orbital into another [31]. The metallic orbitals were assumed to be valence-shell p AOs. An alternative approach to the phenomenon replaces the p AOs with diatomic antibonding $\sigma^*_s$ MOs [32] in the simplest treatment. Because of the existence of a small ionization potential for the alkali metal, the tetraatomic $[(\sigma_s)^1][(\sigma_s)^2(\sigma^*_s)^1]$ configuration with a one-electron bond and a three-electron bond should be energetically close to the $[(\sigma_s)^2][(\sigma_s)^2]$ configuration with two electron-pair bonds. Under the influence of an electrical potential, electron conduction may proceed via one-electron delocalizations, as indicated in structures 5-7 of Figure 1.

6. INSTABILITY OF THREE-ELECTRON BONDS
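The overlap dependence that underlies this section can be sketched with the textbook two-orbital Hückel model including overlap. The code below is a minimal illustration, not the treatment of ref. [33]. It assumes a homonuclear pair with Coulomb integral alpha and a negative reduced resonance integral beta_reduced = beta - alpha*S, held fixed while S is varied. These parameter names and the numerical values are hypothetical.

```python
def mo_energies(alpha, beta_reduced, S):
    """Bonding and antibonding MO energies for a homonuclear pair with overlap S.

    Equivalent to (alpha + beta)/(1 + S) and (alpha - beta)/(1 - S),
    with beta_reduced = beta - alpha*S.
    """
    bonding = alpha + beta_reduced / (1.0 + S)
    antibonding = alpha - beta_reduced / (1.0 - S)
    return bonding, antibonding

def bond_energy(n_bonding, n_antibonding, alpha, beta_reduced, S):
    """Orbital energy of the configuration relative to the separated AOs.

    Negative values mean net stabilization.
    """
    bonding, antibonding = mo_energies(alpha, beta_reduced, S)
    n_electrons = n_bonding + n_antibonding
    return (n_bonding * bonding + n_antibonding * antibonding
            - n_electrons * alpha)

if __name__ == "__main__":
    alpha, beta_reduced = -10.0, -2.0   # arbitrary illustrative values (eV)
    for S in (0.1, 0.2, 0.3, 1.0 / 3.0, 0.4):
        print(f"S = {S:.3f}: one-electron {bond_energy(1, 0, alpha, beta_reduced, S):+.3f}, "
              f"three-electron {bond_energy(2, 1, alpha, beta_reduced, S):+.3f}")
```

In this simplified form the three-electron bond energy is beta_reduced*(1 - 3S)/(1 - S^2), so the stabilization vanishes at S = 1/3 whatever the magnitude of beta_reduced, in line with the threshold of about 0.3 quoted in this section. The one-electron bond, beta_reduced/(1 + S), stays stabilizing over the same range.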
Hückel MO theory with the AO overlap integral $S_{ab}$ included generates an instability for the (homonuclear) three-electron bond when $S_{ab}$ exceeds 0.3 [33], and heteronuclear three-electron bonds are also overlap destabilized under compression. The following speculation, which utilizes the overlap instability of three-electron bonds, may have relevance for the provision of a VB formulation for aspects of the mechanism for electron conduction in the high Tc superconductor YBa2Cu3O7.

# Although these metals are not paramagnetic, it is convenient to include this phenomenon here.

One formulation of an A type layer of the high Tc superconductor YBa2Cu3O7 involves a ...(CuO)(CuO)+(CuO)(CuO)+... arrangement of copper and oxygen ions [30]. Each (CuO) component involves a three-electron bond, which arises from the overlap of the singly-occupied 3dx2-y2 AO of Cu2+ with a doubly-occupied 2pσ AO of O2-, to give a $(\sigma_{CuO})^2(\sigma^*_{CuO})^1$ configuration. Each (CuO)+ component involves a $(\sigma_{CuO})^2$ electron-pair bond configuration. The electron spins are indicated in structure 1 of Figure 6, and each (CuO)(CuO)+ component involves a five-electron four-centre bonding unit, as is also present in an n-type semiconductor (see above) and the solvated electron as H5O2- [30]. Lattice vibrations of the type indicated in structure 2 involve a compression of the three-electron bonds and expansions of the electron-pair bonds. The resulting instability of the three-electron bond would lead to the transfer of electrons from the singly-occupied antibonding MOs of the (CuO) components into vacant antibonding MOs of the (CuO)+ components, as is indicated in 2, to generate structure 3. This process leads to a speculation that a mechanism for superconductivity might be associated with this process if a unidirectional flow of electrons is able to be established, i.e. if the next vibrational mode is able to generate structure 4 rather than structure 2. At any stage of a vibration, the electronic wavefunction may be expressed as $\Psi = \Psi_2 + \mu\Psi_3$ [30]. Conduction via the solvated electron could also involve this type of mechanism [30].

Figure 6. Valence bond structures, lattice vibrations and electron transfer for ...(CuO)(CuO)+(CuO)(CuO)+... → ...(CuO)+(CuO)(CuO)+(CuO)... .

Because the (CuO) components of structure 1 are non-neighbours, superexchange via interaction of 1 with the VB structure for (CuO)+(CuO)(CuO)(CuO)+ is needed to couple antiferromagnetically the spins of their antibonding electrons. In contrast the B layers of YBa2Cu3O7 may be formulated as (CuO)(CuO)(CuO)(CuO), and therefore the extent of overlap between neighbouring $\sigma^*_{CuO}$ MOs should be sufficient to provide the primary contribution to the antiferromagnetic coupling of their electron spins. It should be stressed that the mechanism for high Tc superconductivity is usually considered to involve the formation of Cooper pairs, and the speculative conduction mechanism of Figure 6 has not given consideration to the formation of these pairs.

7. THE THREE-ELECTRON BOND WITH FOUR OR MORE AOs
The equivalence that exists between the MO and VB resonance descriptions of the three-electron bond is only exact when one AO per centre is used to accommodate the electrons. This has been discussed in some detail by Murrell and Ralston [34], and elaborated further in VB calculations for a variety of systems by Hiberty and coworkers [8(a),35]. However, when four or more AOs are used to accommodate the three electrons, the VB structure XII or XIII may still be associated with an orbital wavefunction, and an extended VB-MO equivalence may be developed [36]. We shall use H2- to demonstrate this result for the simplest case, which involves the use of four AOs to accommodate the electrons. The H-atom and H- ion AOs will be designated a and b, and a' and b' respectively. The resulting $S = M_S = 1/2$ spin wavefunction for resonance between the Lewis VB structures (H:  ·H)- and (H·  :H)- is given by eq.(5).

$$\Psi(\mathrm{VB}) = \Psi[(\mathrm{H{:}\;\,{\cdot}H})^-] + \Psi[(\mathrm{H{\cdot}\;\,{:}H})^-] = |a'^{\alpha}a'^{\beta}b^{\alpha}| + |a^{\alpha}b'^{\beta}b'^{\alpha}| \tag{5}$$

When an a' electron of the $(a')^2(b)^1$ configuration for the VB structure (H:  ·H)- is delocalized into the AB bonding MO $\psi_{a'b} = a' + kb$, to generate the three-electron bond VB structure (H ··· H)-, one $S = M_S = +1/2$ spin wavefunction for the three electrons is given by eq.(6),

$$\Psi_I(a'b\psi_{a'b}) = 2|a'^{\alpha}b^{\alpha}\psi_{a'b}^{\beta}| - |a'^{\alpha}b^{\beta}\psi_{a'b}^{\alpha}| - |a'^{\beta}b^{\alpha}\psi_{a'b}^{\alpha}| \tag{6}$$

$$= 2|a'^{\alpha}b^{\alpha}\psi_{a'b}^{\beta}| - |(a'^{\alpha}b^{\beta} + a'^{\beta}b^{\alpha})\psi_{a'b}^{\alpha}| \tag{7}$$

$$= 2\{(a')^1(b)^1\,(S = 1, M_S = 1)\}\{(\psi_{a'b})^1\,(s = 1/2, m_s = -1/2)\} - \{(a')^1(b)^1\,(S = 1, M_S = 0)\}\{(\psi_{a'b})^1\,(s = 1/2, m_s = +1/2)\} \tag{8}$$

$$= 3|a'^{\alpha}b^{\alpha}a'^{\beta}| + 3k|a'^{\alpha}b^{\alpha}b^{\beta}| \tag{9}$$

which is equivalent to each of eqs.(7)-(9). For each of the two $(a')^1(b)^1$ configurations of eq.(8), the electron spins are parallel (S = 1). Because eq.(6) involves an S = 1/2 spin-state, the spins of the two electrons of $(a')^1(b)^1$ must be opposed to that of the $\psi_{a'b}$ electron. In eq.(9), the two Slater determinants represent (H:  ·H)- and (H·  :H)- VB structures with $(a')^2(b)^1$ and $(a')^1(b)^2$ configurations respectively. When a b' electron of the $(a)^1(b')^2$ configuration for the VB structure (H·  :H)- is delocalized into the bonding MO $\psi_{b'a} = b' + ka$, to generate the three-electron bond VB structure (H ··· H)-, the resulting $S = M_S = 1/2$ spin wavefunction may be expressed according to eq.(10),

$$\Psi_I(b'a\psi_{b'a}) = 3|b'^{\alpha}a^{\alpha}b'^{\beta}| + 3k|b'^{\alpha}a^{\alpha}a^{\beta}| \tag{10}$$

The linear combination of eq.(11)

$$\Psi_I = \Psi_I(a'b\psi_{a'b}) - \Psi_I(b'a\psi_{b'a}) \tag{11}$$

$$= 3\{|a'^{\alpha}b^{\alpha}a'^{\beta}| - |b'^{\alpha}a^{\alpha}b'^{\beta}| + k(|a'^{\alpha}b^{\alpha}b^{\beta}| - |b'^{\alpha}a^{\alpha}a^{\beta}|)\} \tag{12}$$

$$= 3(|a'^{\alpha}b^{\alpha}\psi_{a'b}^{\beta}| + |a^{\alpha}b'^{\alpha}\psi_{b'a}^{\beta}|) \tag{13}$$

$$= -3(|\psi_{a'b}^{\alpha}\psi_{a'b}^{*\alpha}\psi_{a'b}^{\beta}| - |\psi_{b'a}^{\alpha}\psi_{b'a}^{*\alpha}\psi_{b'a}^{\beta}|)/(1 + kk^*) \tag{14}$$

represents a $^2\Sigma_u^+$ state. Eq.(14) provides the MO equivalent to a VB treatment for the three-electron bond when four AOs are used to accommodate the three electrons. The antibonding MOs $\psi^*_{a'b} = k^*a' - b$ and $\psi^*_{b'a} = k^*b' - a$ are orthogonal to the bonding MOs $\psi_{a'b}$ and $\psi_{b'a}$ respectively. The $S = M_S = 1/2$ configuration of eq.(15)

$$\Psi_{II}(a'b\psi_{a'b}) = |a'^{\alpha}b^{\beta}\psi_{a'b}^{\alpha}| - |a'^{\beta}b^{\alpha}\psi_{a'b}^{\alpha}| \tag{15}$$

$$= |(a'^{\alpha}b^{\beta} - a'^{\beta}b^{\alpha})\psi_{a'b}^{\alpha}| \tag{16}$$

$$= \{(a')^1(b)^1\,(S = M_S = 0)\}\{(\psi_{a'b})^1\,(s = 1/2, m_s = +1/2)\} \tag{17}$$

$$= |a'^{\alpha}b^{\alpha}a'^{\beta}| - k|a'^{\alpha}b^{\alpha}b^{\beta}| \tag{18}$$

may also be constructed. When k = 1, it is orthogonal to the $\Psi_I(a'b\psi_{a'b})$ of eq.(9), and involves opposed (S = 0) spins for the two electrons of the $(a')^1(b)^1$ configuration in eq.(17). Similar types of properties apply to the $\Psi_{II}(b'a\psi_{b'a})$ of eq.(19)

$$\Psi_{II}(b'a\psi_{b'a}) = |b'^{\alpha}a^{\beta}\psi_{b'a}^{\alpha}| - |b'^{\beta}a^{\alpha}\psi_{b'a}^{\alpha}| \tag{19}$$

$$= |b'^{\alpha}a^{\alpha}b'^{\beta}| - k|b'^{\alpha}a^{\alpha}a^{\beta}| \tag{20}$$

It may then be deduced that $\Psi_{II}(a'b\psi_{a'b}) + \Psi_{II}(b'a\psi_{b'a})$ may be expressed according to eq.(21),

$$\Psi_{II}(a'b\psi_{a'b}) + \Psi_{II}(b'a\psi_{b'a}) = -(|\phi_{a'b}^{\alpha}\phi_{a'b}^{*\alpha}\phi_{a'b}^{*\beta}| + |\phi_{b'a}^{\alpha}\phi_{b'a}^{*\alpha}\phi_{b'a}^{*\beta}|)/(1 + kk^*) \tag{21}$$

thereby generating a $^2\Sigma_g^+$ state. In eq.(21), $\phi_{a'b} = k^*a' + b$, $\phi^*_{a'b} = a' - kb$, $\phi_{b'a} = k^*b' + a$ and $\phi^*_{b'a} = b' - ka$ are sets of orthogonal bonding and antibonding MOs. Further development of this type of approach - in particular with double-zeta $(a')^1(a'')^1$ and $(b')^1(b'')^1$ (and larger basis set) configurations for the H- anions of the (H:  ·H)- and (H·  :H)- VB structures - is provided in ref. [36]. Bond polarity when A and B are non-equivalent atoms may also be introduced. In the remainder of this paper, we shall restrict our attention to the two AO formulations of the three-electron bond.
8. THREE-ELECTRON BONDS AND INCREASED-VALENCE STRUCTURES FOR FOUR-ELECTRON THREE-CENTRE BONDING

Three-electron bonds are usually associated with paramagnetic systems. However, it is not generally appreciated that three-electron bonds may be incorporated into the VB structures for diamagnetic systems. This incorporation involves the spin-pairing of the unpaired $\psi^{*}_{ab}$ electron of the three-electron bond structure $\dot{\mathrm{A}} \cdot \dot{\mathrm{B}}$ with the unpaired electron of a second radical species $\dot{\mathrm{Y}}$ when the orbitals for the two unpaired electrons overlap [4-6,37,38]. If the odd electron of Y occupies the AO $y$, the singlet ($S = 0$) spin wavefunction for the four electrons is that of eq.(22),

$$\Psi(\mathrm{Y}\text{-}\psi^{*}_{ab}) = |y^{\beta}\psi_{ab}^{*\alpha}\psi_{ab}^{\beta}\psi_{ab}^{\alpha}| + |y^{\alpha}\psi_{ab}^{*\beta}\psi_{ab}^{\alpha}\psi_{ab}^{\beta}| \tag{22}$$

$$= (1 + kk^{*})\{|y^{\beta}a^{\alpha}a^{\beta}b^{\alpha}| + |y^{\alpha}a^{\beta}a^{\alpha}b^{\beta}| + k(|y^{\beta}a^{\alpha}b^{\beta}b^{\alpha}| + |y^{\alpha}a^{\beta}b^{\alpha}b^{\beta}|)\} = (1 + kk^{*})(\Psi_{\mathrm{XVII}} + k\Psi_{\mathrm{XVIII}}) \tag{23}$$

which is equivalent to eq.(23), where $\Psi_{\mathrm{XVII}}$ and $\Psi_{\mathrm{XVIII}}$ are the $S = 0$ spin wavefunctions for the Lewis structures XVII and XVIII.
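The equivalence of eqs.(22) and (23) can be confirmed by expanding the determinants over AO spin-orbitals. The sketch below is illustrative only: it assumes $\psi_{ab} = a + kb$ and $\psi^{*}_{ab} = k^{*}a - b$ with a real, arbitrarily chosen $k$, and the helper names (`slater`, `combine`, `orb`) are hypothetical. With these conventions the expansion carries the overall factor $+(1 + kk^{*})$.

```python
from fractions import Fraction
from itertools import product

def slater(columns):
    """Expand a determinant whose columns are {(ao, spin): coeff} dicts."""
    expansion = {}
    for picks in product(*(col.items() for col in columns)):
        labels = [label for label, _ in picks]
        if len(set(labels)) < len(labels):
            continue  # repeated spin-orbital: the determinant vanishes
        coeff = Fraction(1)
        for _, value in picks:
            coeff *= value
        # sign of the permutation that sorts the spin-orbital labels
        inversions = sum(labels[i] > labels[j]
                         for i in range(len(labels))
                         for j in range(i + 1, len(labels)))
        key = tuple(sorted(labels))
        expansion[key] = expansion.get(key, 0) + (-1) ** inversions * coeff
    return {key: c for key, c in expansion.items() if c != 0}

def combine(*terms):
    """Sum of (factor, expansion) pairs, with zero coefficients dropped."""
    total = {}
    for factor, expansion in terms:
        for key, c in expansion.items():
            total[key] = total.get(key, 0) + factor * c
    return {key: c for key, c in total.items() if c != 0}

def orb(spin, **coeffs):
    """Spin-orbital built from AO coefficients given as keywords."""
    return {(ao, spin): Fraction(c) for ao, c in coeffs.items()}

UP, DN = "alpha", "beta"
k = Fraction(2, 3)                           # illustrative real value, k* = k

def y(s): return orb(s, y=1)
def a(s): return orb(s, a=1)
def b(s): return orb(s, b=1)
def psi_ab(s): return orb(s, a=1, b=k)       # bonding MO a + k b
def psi_ab_star(s): return orb(s, a=k, b=-1) # antibonding MO k* a - b

# eq.(22): odd electron of Y spin-paired with the psi*_ab electron
lhs = combine((1, slater([y(DN), psi_ab_star(UP), psi_ab(DN), psi_ab(UP)])),
              (1, slater([y(UP), psi_ab_star(DN), psi_ab(UP), psi_ab(DN)])))

# Lewis structures XVII (long Y-B bond) and XVIII (Y-A bond)
psi_xvii = combine((1, slater([y(DN), a(UP), a(DN), b(UP)])),
                   (1, slater([y(UP), a(DN), a(UP), b(DN)])))
psi_xviii = combine((1, slater([y(DN), a(UP), b(DN), b(UP)])),
                    (1, slater([y(UP), a(DN), b(UP), b(DN)])))

# eq.(23): (1 + k k*)(Psi_XVII + k Psi_XVIII)
rhs = combine((1 + k * k, psi_xvii), ((1 + k * k) * k, psi_xviii))
print(lhs == rhs)
```

The check makes explicit that the long-bond structure XVII enters with unit weight relative to $k$ for the Kekulé structure XVIII.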
[VB structures XVII, XVIII and XIX for the Y, A, B three-centre bonding unit.]
In structure XVII, a "long" or formal bond links the Y and B atoms. When these atoms are non-adjacent, the overlap between the $y$ and $b$ AOs is extremely small, and the Y-B bond then has negligible strength. Often workers do not indicate the presence of this formal bond. When this bond is omitted, VB structure XIX is obtained, which is designated as a singlet/spin-paired diradical structure. Inclusion of the formal bond leads to the "long-bond", "formal-bond" or "Dewar" designations for structure XVII. Whichever designation is used to describe structure XVII or XIX, this type of structure is usually omitted from qualitative VB descriptions of bonding, and Pauling rarely gave consideration to it. However, the results of both semi-empirical [39] and ab initio VB calculations from a variety of laboratories [40-44] indicate that Lewis structures of this type often make significant contributions to the ground-state resonance scheme. This is especially the case when the familiar "Kekulé" Lewis structures of type XVIII involve formal charge separation (as occurs in 1,3-dipolar molecules, for example), and at least one of the Dewar structures does not.

An MO three-centre bond index has been introduced by Giambiagi, de Giambiagi and coworkers [45]. These workers have calculated values of this index ($I_{\mathrm{AC}}$, with A and C as terminal atoms) for a variety of systems that involve four-electron three-centre bonding units, and have concluded that three-centre bonding indices have appreciable values if and only if there exists a "long" or secondary bond between a pair of non-adjacent atoms. Thus the (STO-6G) value of 0.5812 for the $I_{\mathrm{AC}}$ of the 1,3-dipolar molecule N$_2$O [45] is appreciable; this result is in accord with the existence of substantial contributions of "long-bond" (i.e. singlet-diradical) Lewis structures to the ground-state resonance scheme for this molecule [39(c),40].
Because the three-electron bond structure is equivalent to resonance between structures Xa and Xb, it follows from eq.(3) that resonance between the Lewis structures XIX (or XVII) and XVIII is equivalent to use of the VB structure XX. In XX, a thin Y-A bond line is used to indicate that the Y-A bond-number in this structure is fractional, i.e. its bond-number is less than the value of unity that obtains for the Kekulé Lewis structure XVIII [4-6,46]. (The fractionality is a consequence of the absence of a Y-A bond in structure XVII [4-6,46].) Valence bond structure XX is an example of an increased-valence structure for a four-electron three-centre bonding unit [4-6,37,38]. It may always be generated from the Lewis structure of type XVIII by delocalizing a non-bonding $b$ electron into the bonding MO $\psi_{ab}$, as is indicated in XXI $\rightarrow$ XX,
[VB structures XXI $\rightarrow$ XX.]
thereby incorporating the three-electron bond into the VB structure for a triatomic diamagnetic bonding unit. When the parameter $k$ for $\psi_{ab}$ is chosen variationally, the one-electron delocalization indicated in XXI always stabilizes the increased-valence structure XX relative to the Kekulé Lewis structure XVIII [46].

9. INCREASED-VALENCE STRUCTURES AND MULLIKEN-DONOR-ACCEPTOR COMPLEXES
If Y and A-B are respectively an n-electron donor (D) and a sacrificial electron acceptor (A) as defined by Mulliken [47], then the Mulliken-type wavefunction of eq.(24)

$$\Psi_{\mathrm{N}} = \Psi(\mathrm{D,A}) + \lambda\Psi(\mathrm{D}^{+}\text{-}\mathrm{A}^{-}) \tag{24}$$

for the ground-state of the (D...A) complex is equivalent to the wavefunction $\Psi(\mathrm{D}\cdot\mathrm{A})$ for the donor-acceptor complex, in which a one-electron bond is formed between the donor and the acceptor [48]. In eq.(24), the $S = 0$ spin wavefunctions $\Psi(\mathrm{D,A})$ and $\Psi(\mathrm{D}^{+}\text{-}\mathrm{A}^{-})$ are given by eqs.(25) and (26) respectively.

$$\Psi(\mathrm{D,A}) = |y^{\alpha}y^{\beta}\psi_{ab}^{\alpha}\psi_{ab}^{\beta}| \tag{25}$$

$$\Psi(\mathrm{D}^{+}\text{-}\mathrm{A}^{-}) = |y^{\beta}\psi_{ab}^{*\alpha}\psi_{ab}^{\beta}\psi_{ab}^{\alpha}| + |y^{\alpha}\psi_{ab}^{*\beta}\psi_{ab}^{\alpha}\psi_{ab}^{\beta}| \tag{26}$$

For $\Psi(\mathrm{D,A})$, the associated VB structure is the "no (D,A) bond" structure XXII. The dative structure (D$^{+}$-A$^{-}$) is obtained from XXII via the delocalization of a Y electron of the donor into the vacant A-B antibonding MO $\psi^{*}_{ab} = k^{*}a - b$ of the acceptor. The delocalization proceeds according to XXIII $\rightarrow$ XXIV $\rightarrow$ XX, to generate the increased-valence structure XX for (D$^{+}$-A$^{-}$).
[VB structures XXII, XXIII and XXIV.]
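When an electron is delocalized from the $y$ orbital of the donor into the Y-A bonding MO $\psi_{ya} = y + la$, according to XXV $\rightarrow$ XXVI,

[VB structures XXV $\rightarrow$ XXVI.]

the VB structure XXVI is generated for the (D $\cdot$ A) complex. The resulting $S = 0$ spin wavefunction is given by eq.(27).

$$\Psi(\mathrm{D}\cdot\mathrm{A}) = |y^{\beta}\psi_{ya}^{\alpha}\psi_{ab}^{\beta}\psi_{ab}^{\alpha}| + |y^{\alpha}\psi_{ya}^{\beta}\psi_{ab}^{\alpha}\psi_{ab}^{\beta}| \tag{27}$$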
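The stated equivalence between the Mulliken-type wavefunction of eq.(24) and the one-electron-bond wavefunction of eq.(27) can be illustrated by direct expansion. The sketch below assumes real parameters, $\psi_{ab} = a + kb$, $\psi^{*}_{ab} = ka - b$ and $\psi_{ya} = y + la$ with arbitrary illustrative values; the helper names and the variable `ell` (standing for $l$) are hypothetical. Under these assumptions eq.(27) equals $2\Psi(\mathrm{D,A}) + [lk/(1 + k^{2})]\Psi(\mathrm{D}^{+}\text{-}\mathrm{A}^{-})$, so the mixing coefficient of the dative term in the unnormalized eq.(24) would be $lk/[2(1 + k^{2})]$.

```python
from fractions import Fraction
from itertools import product

def slater(columns):
    """Expand a determinant whose columns are {(ao, spin): coeff} dicts."""
    expansion = {}
    for picks in product(*(col.items() for col in columns)):
        labels = [label for label, _ in picks]
        if len(set(labels)) < len(labels):
            continue  # repeated spin-orbital: the determinant vanishes
        coeff = Fraction(1)
        for _, value in picks:
            coeff *= value
        # sign of the permutation that sorts the spin-orbital labels
        inversions = sum(labels[i] > labels[j]
                         for i in range(len(labels))
                         for j in range(i + 1, len(labels)))
        key = tuple(sorted(labels))
        expansion[key] = expansion.get(key, 0) + (-1) ** inversions * coeff
    return {key: c for key, c in expansion.items() if c != 0}

def combine(*terms):
    """Sum of (factor, expansion) pairs, with zero coefficients dropped."""
    total = {}
    for factor, expansion in terms:
        for key, c in expansion.items():
            total[key] = total.get(key, 0) + factor * c
    return {key: c for key, c in total.items() if c != 0}

def orb(spin, **coeffs):
    """Spin-orbital built from AO coefficients given as keywords."""
    return {(ao, spin): Fraction(c) for ao, c in coeffs.items()}

UP, DN = "alpha", "beta"
k = Fraction(2, 3)      # illustrative polarity parameter of psi_ab (real)
ell = Fraction(1, 4)    # illustrative value of l in psi_ya = y + l a

def y(s): return orb(s, y=1)
def psi_ya(s): return orb(s, y=1, a=ell)      # Y-A bonding MO
def psi_ab(s): return orb(s, a=1, b=k)        # A-B bonding MO
def psi_ab_star(s): return orb(s, a=k, b=-1)  # A-B antibonding MO

# eq.(27): one-electron Y-A bond
one_electron_bond = combine(
    (1, slater([y(DN), psi_ya(UP), psi_ab(DN), psi_ab(UP)])),
    (1, slater([y(UP), psi_ya(DN), psi_ab(UP), psi_ab(DN)])))

# eq.(25) and eq.(26)
no_bond = slater([y(UP), y(DN), psi_ab(UP), psi_ab(DN)])
dative = combine(
    (1, slater([y(DN), psi_ab_star(UP), psi_ab(DN), psi_ab(UP)])),
    (1, slater([y(UP), psi_ab_star(DN), psi_ab(UP), psi_ab(DN)])))

mixing = ell * k / (1 + k * k)
print(one_electron_bond == combine((2, no_bond), (mixing, dative)))
```

The identity follows from writing $a = (\psi_{ab} + k\psi^{*}_{ab})/(1 + k^{2})$ inside $\psi_{ya}$; the $\psi_{ab}$ component vanishes by the Pauli principle.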
Valence-bond structure XXVI is an example of an increased-valence structure when LMOs are used to accommodate the three (fractional) bonding electrons [38]. Elsewhere, it has been deduced that the A-atom valence is able to exceed unity in each of the VB structures XX [46] and XXVI [38]. Increased-valence structures of the type XXVI will be used in the discussion of SN2 reactions which is provided in the next section. (To simplify the spin formulation in this section, we have restricted our attention to the use of the bonding MO $\psi_{ab} = a + kb$ to accommodate the electrons of the A-B electron-pair bond. Formulations with Coulson-Fischer [49] type MOs, for example $\psi'_{ab} = a + k'b$ and $\psi''_{ba} = b + k''a$, are provided in refs. [30,38,50]. For the VB structures of Figure 7 below, it will be assumed that the latter MOs are appropriate.)
10. INCREASED-VALENCE STRUCTURES AND SN2 REACTIONS

The generalized gas-phase displacement reaction Nu:$^{(-)}$ + R-X $\rightarrow$ Nu-R + :X$^{(-)}$ is usually formulated according to Scheme A of Figure 7, in which a pair of electrons is transferred in concert from the nucleophile Nu:$^{(-)}$ to the substrate R-X, and simultaneously the pair of R-X bonding electrons is transferred in concert from the substrate to the leaving group :X$^{(-)}$. An alternative formulation proceeds according to Scheme B of Figure 7 [30,38,50], in which the initial step involves the delocalization of one electron from the nucleophile into a Nu-R bonding MO, to form the VB structure 5 for a reactant-like complex with a one-electron Nu-R bond. Electronic reorganization proceeds via one-electron delocalizations to generate VB structure 6 for a product-like complex with a one-electron R-X bond. Decomposition of the product-like complex via a one-electron transfer generates Nu-R + :X$^{(-)}$ as products. Consideration has been given elsewhere to the nature of the reactant-like and product-like complexes at the conclusion and commencement of the reaction respectively [30,38,50]. It has been demonstrated [38] that structure 5 generates structure 8 at the conclusion of the reaction, and that structure 6 generates structure 9 at the commencement of the reaction, as is indicated in Figure 7. It has also been deduced [38] that the initial step for the reaction must involve a one-electron transfer from the nucleophile to the substrate, rather than a concerted two-electron transfer. Structures 8 and 9 of Figure 7 involve a three-electron bond for either (NuR)$^{-}$ or (RX)$^{-}$. For these species, the less-satisfactory structures (Nu$\therefore$R)$^{-}$ and (R$\therefore$X)$^{-}$ have been used in ref. 51 instead of ($\dot{\mathrm{Nu}} \cdot \dot{\mathrm{R}}$)$^{-}$ and ($\dot{\mathrm{R}} \cdot \dot{\mathrm{X}}$)$^{-}$.
In refs. [30,38,50], the state correlation diagrams with linear connections between structures 5 and 8, and 6 and 9, are schematic only, and the formation of reactant and product complexes as possible intermediates is not then indicated.
[Figure 7: Scheme A and Scheme B.]
Figure 7. Valence bond representations for gas-phase SN2 reaction.
11. THREE-ELECTRON BONDS AND FIVE-ELECTRON THREE-CENTRE BONDING
Five-electron three-centre bonding units involve the distribution of five electrons amongst three overlapping AOs that are located on three atomic centres. The $\pi$-electrons of triatomic systems with 19 valence-shell electrons provide examples of these types of bonding units when only valence-shell p$\pi$ AOs are used to accommodate the electrons. Another example is provided by the indirect interactions of two nitrogen atoms via a hydrogen atom in medium-ring bicyclic compounds [23]. We consider here symmetrical systems, for which there are three canonical Lewis structures, XXVII-XXIX,

[Lewis structures XXVII $\leftrightarrow$ XXVIII $\leftrightarrow$ XXIX.]

in which the odd electron is located in either a $y$ or an $a$ or a $b$ AO [52,53]. The AO wavefunctions for these structures are those of eq.(28)

$$\Psi_{\mathrm{XXVII}} = |y^{\alpha}a^{\alpha}b^{\alpha}y^{\beta}a^{\beta}|, \quad \Psi_{\mathrm{XXVIII}} = |y^{\alpha}a^{\alpha}b^{\alpha}y^{\beta}b^{\beta}|, \quad \Psi_{\mathrm{XXIX}} = |y^{\alpha}a^{\alpha}b^{\alpha}a^{\beta}b^{\beta}| \tag{28}$$
and resonance between structures XXVII-XXIX is equivalent to the formation of the linear combination of eq.(29),

$$\Psi = \lambda(\Psi_{\mathrm{XXVII}} + \Psi_{\mathrm{XXIX}}) + \Psi_{\mathrm{XXVIII}} \tag{29}$$

in which $\lambda$ is a variational parameter. This resonance is also equivalent to resonance between the three-electron bond structures XXX and XXXI.

[VB structures XXX, XXXI and XXXII.]

The wavefunctions for the latter structures are those of eq.(30).

$$\Psi_{\mathrm{XXX}} = |y^{\alpha}a^{\alpha}b^{\alpha}(\lambda a + y/2)^{\beta}b^{\beta}|, \quad \Psi_{\mathrm{XXXI}} = |y^{\alpha}a^{\alpha}b^{\alpha}y^{\beta}(\lambda a + b/2)^{\beta}| \tag{30}$$
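In turn, $\Psi = \Psi_{\mathrm{XXX}} + \Psi_{\mathrm{XXXI}}$ is also equivalent to:

(a) The canonical MO (CMO) configuration of eq.(31),

$$\Psi_{\mathrm{CMO}} = K|\phi_1^{\alpha}\phi_2^{\alpha}\phi_3^{\alpha}\phi_1^{\beta}\phi_2^{\beta}| \tag{31}$$

in which the canonical MOs are given by eq.(32).

$$\phi_1 = \lambda a + (y + b)/2, \quad \phi_2 = y - b, \quad \phi_3 = \lambda^{*}a - (y + b)/2 \tag{32}$$

(b) The localized MO (LMO) configuration of eq.(33):

$$\Psi_{\mathrm{LMO}} = |y^{\alpha}a^{\alpha}b^{\alpha}(y + \lambda a)^{\beta}(b + \lambda a)^{\beta}| \tag{33}$$

with two non-independent three-electron bonds ($|y^{\alpha}a^{\alpha}(y + \lambda a)^{\beta} \ldots|$ and $|\ldots a^{\alpha}b^{\alpha}(b + \lambda a)^{\beta}|$), for which the associated VB structure is XXXII.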
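The three equivalent representations of eqs.(29)-(33) can be compared by expanding each wavefunction over AO spin-orbitals. The following sketch is illustrative: it takes $\lambda$ real with an arbitrary value, and the helper names are hypothetical. It confirms that $\Psi_{\mathrm{XXX}} + \Psi_{\mathrm{XXXI}}$ and $\Psi_{\mathrm{LMO}}$ both reproduce eq.(29), and that the unnormalized CMO determinant of eq.(31) is proportional to it; with the MOs as written in eq.(32) the proportionality factor is $-2\lambda$.

```python
from fractions import Fraction
from itertools import product

def slater(columns):
    """Expand a determinant whose columns are {(ao, spin): coeff} dicts."""
    expansion = {}
    for picks in product(*(col.items() for col in columns)):
        labels = [label for label, _ in picks]
        if len(set(labels)) < len(labels):
            continue  # repeated spin-orbital: the determinant vanishes
        coeff = Fraction(1)
        for _, value in picks:
            coeff *= value
        # sign of the permutation that sorts the spin-orbital labels
        inversions = sum(labels[i] > labels[j]
                         for i in range(len(labels))
                         for j in range(i + 1, len(labels)))
        key = tuple(sorted(labels))
        expansion[key] = expansion.get(key, 0) + (-1) ** inversions * coeff
    return {key: c for key, c in expansion.items() if c != 0}

def combine(*terms):
    """Sum of (factor, expansion) pairs, with zero coefficients dropped."""
    total = {}
    for factor, expansion in terms:
        for key, c in expansion.items():
            total[key] = total.get(key, 0) + factor * c
    return {key: c for key, c in total.items() if c != 0}

def orb(spin, **coeffs):
    """Spin-orbital built from AO coefficients given as keywords."""
    return {(ao, spin): Fraction(c) for ao, c in coeffs.items()}

UP, DN = "alpha", "beta"
lam = Fraction(3, 5)        # illustrative real value of lambda
half = Fraction(1, 2)

core = [orb(UP, y=1), orb(UP, a=1), orb(UP, b=1)]   # y, a, b with alpha spin

# eq.(28) and eq.(29)
psi_xxvii = slater(core + [orb(DN, y=1), orb(DN, a=1)])
psi_xxviii = slater(core + [orb(DN, y=1), orb(DN, b=1)])
psi_xxix = slater(core + [orb(DN, a=1), orb(DN, b=1)])
resonance = combine((lam, psi_xxvii), (lam, psi_xxix), (1, psi_xxviii))

# eq.(30): three-electron bond structures XXX and XXXI
psi_xxx = slater(core + [orb(DN, a=lam, y=half), orb(DN, b=1)])
psi_xxxi = slater(core + [orb(DN, y=1), orb(DN, a=lam, b=half)])
three_electron = combine((1, psi_xxx), (1, psi_xxxi))

# eq.(33): localized MO configuration
lmo = slater(core + [orb(DN, y=1, a=lam), orb(DN, b=1, a=lam)])

# eqs.(31)-(32): canonical MO configuration with K omitted
def phi1(s): return orb(s, a=lam, y=half, b=half)
def phi2(s): return orb(s, y=1, b=-1)
def phi3(s): return orb(s, a=lam, y=-half, b=-half)
cmo = slater([phi1(UP), phi2(UP), phi3(UP), phi1(DN), phi2(DN)])

print(three_electron == resonance)
print(lmo == resonance)
print(cmo == combine((-2 * lam, resonance)))
```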
Thus five-electron three-centre bonding units have three different types of VB representations, the wavefunctions for which are equivalent [52,53]. As an example, we display them in Figure 8 for the five $\pi$ electrons of sym ClO$_2$, when it is assumed that these electrons only occupy p$\pi$ AOs. Spin-pairing of the $\phi_3$ odd electrons of two ClO$_2$ molecules generates Cl$_2$O$_4$, for which the asym isomer OClOClO$_2$ has been identified recently [54]. The resulting (equivalent) VB representations [53] involve either nine Lewis structures, or four increased-valence structures, each of which involves two three-electron bonds, or one increased-valence structure with four three-electron bonds. The latter increased-valence structure is displayed in Figure 9, together with an increased-valence structure in which the locations of the three-electron bonds have been interchanged [53].

Figure 8. Valence bond structures for ClO$_2$, with no participation of chlorine 3d AOs as hybridization functions.

Figure 9. Increased-valence structures for OClOClO$_2$ [53].
12. THREE-ELECTRON BONDS AND INCREASED-VALENCE STRUCTURES FOR EXTENDED SIX-ELECTRON FOUR-CENTRE BONDING

If two three-electron bond structures are juxtaposed as in VB structure XXXII, so that their antibonding $\psi^{*}_{ab}$ and $\psi^{*}_{cd}$ MOs overlap,

[VB structures XXXII and XXXIII.]

the odd electrons that occupy these MOs may be spin-paired to generate the increased-valence structure XXXIII for a six-electron four-centre bonding unit with the $S = 0$ spin wavefunction of eq.(34) [55].

$$\Psi_{\mathrm{XXXIII}} = |\psi_{ab}^{\alpha}\psi_{ab}^{\beta}\psi_{cd}^{\alpha}\psi_{cd}^{\beta}\psi_{ab}^{*\alpha}\psi_{cd}^{*\beta}| + |\psi_{ab}^{\alpha}\psi_{ab}^{\beta}\psi_{cd}^{\alpha}\psi_{cd}^{\beta}\psi_{cd}^{*\alpha}\psi_{ab}^{*\beta}| \tag{34}$$

$$= (1 + kk^{*})(1 + k'k'^{*})(|a^{\alpha}\psi_{ab}^{\beta}\psi_{cd}^{\alpha}d^{\beta}b^{\alpha}c^{\beta}| + |\psi_{ab}^{\alpha}a^{\beta}d^{\alpha}\psi_{cd}^{\beta}c^{\alpha}b^{\beta}|) \tag{35}$$
This VB structure is equivalent to resonance between the Kekulé-type Lewis structure XXXIV and the Dewar-type Lewis structures XXXV-XXXVII,

[Lewis structures XXXIV, XXXV, XXXVI and XXXVII.]

and it may also be generated from structure XXXIV via the one-electron delocalizations that are indicated in structure XXXVIII [55].
[VB structure XXXVIII.]

Figure 10. Construction of increased-valence structures for NO$_2$ and N$_2$O$_4$ from Lewis and three-electron bond structures.
Examples of increased-valence structures, and the Lewis structures and three-electron bond structures from which they may be derived, are displayed in Figure 10 for NO$_2$ and N$_2$O$_4$. These two molecules involve four-electron three-centre and five-electron three-centre bonding units. N$_2$O$_4$ also possesses six-electron four-centre bonding units that are components of a ten-electron six-centre bonding unit. Because of the inclusion of Dewar-type as well as the Kekulé-type structures in the Lewis structure resonance scheme, the increased-valence structures are more stable than are the familiar Kekulé-type Lewis structures from which they are derived, provided that the one-electron bond polarity parameters are chosen variationally. Therefore, as discussed already in Section 8, a better (i.e. lower energy) VB description of the bonding may be obtained when increased-valence structures, rather than only the component Kekulé-type structures, are used to provide VB representations of electronic structure.

On several occasions [55a,c], it has been demonstrated that, when A and D, and B and C, are pairs of equivalent atoms, with equivalent overlapping AOs involved in the six-electron four-centre bonding unit, the wavefunction for increased-valence structure XXXIII is equivalent to the covalent (AB-CD) component of the four-centre canonical MO configuration of eq.(36),

$$\Phi_1(\mathrm{MO}) = |\psi_1^{\alpha}\psi_1^{\beta}\psi_2^{\alpha}\psi_2^{\beta}\psi_3^{\alpha}\psi_3^{\beta}| \tag{36}$$

in which the canonical MOs are given by eq.(37).

$$\begin{aligned}
\psi_1 &= a + d + \lambda(b + c) = s_3 + \lambda s_1, & \psi_2 &= a - d + k(b - c) = s_4 + ks_2, \\
\psi_3 &= \lambda^{*}(a + d) - (b + c) = \lambda^{*}s_3 - s_1, & \psi_4 &= k^{*}(a - d) - (b - c) = k^{*}s_4 - s_2
\end{aligned} \tag{37}$$
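This result is obtained as follows [55] via a series of unitary transformations of the $\psi_1$ and $\psi_3$ canonical MOs of $\Phi_1(\mathrm{MO})$, to give eq.(40).

$$\Phi_1(\mathrm{MO}) = |\psi_1^{\alpha}\psi_1^{\beta}\psi_2^{\alpha}\psi_2^{\beta}\psi_3^{\alpha}\psi_3^{\beta}|$$

$$= |(s_3 + \lambda s_1)^{\alpha}(s_3 + \lambda s_1)^{\beta}(s_4 + ks_2)^{\alpha}(s_4 + ks_2)^{\beta}(\lambda^{*}s_3 - s_1)^{\alpha}(\lambda^{*}s_3 - s_1)^{\beta}| \tag{38}$$

$$= K_1'|s_1^{\alpha}s_1^{\beta}(s_4 + ks_2)^{\alpha}(s_4 + ks_2)^{\beta}s_3^{\alpha}s_3^{\beta}|$$

$$= K_1''|(s_3 + ks_1)^{\alpha}(s_3 + ks_1)^{\beta}(s_4 + ks_2)^{\alpha}(s_4 + ks_2)^{\beta}(k^{*}s_3 - s_1)^{\alpha}(k^{*}s_3 - s_1)^{\beta}|$$

$$= K_1''|(\psi_{ab} + \psi_{cd})^{\alpha}(\psi_{ab} + \psi_{cd})^{\beta}(\psi_{ab} - \psi_{cd})^{\alpha}(\psi_{ab} - \psi_{cd})^{\beta}(\psi^{*}_{ab} + \psi^{*}_{cd})^{\alpha}(\psi^{*}_{ab} + \psi^{*}_{cd})^{\beta}|$$

$$= K_1|\psi_{ab}^{\alpha}\psi_{ab}^{\beta}\psi_{cd}^{\alpha}\psi_{cd}^{\beta}(\psi^{*}_{ab} + \psi^{*}_{cd})^{\alpha}(\psi^{*}_{ab} + \psi^{*}_{cd})^{\beta}|$$

$$= K_1(|\psi_{ab}^{\alpha}\psi_{ab}^{\beta}\psi_{cd}^{\alpha}\psi_{cd}^{\beta}\psi_{ab}^{*\alpha}\psi_{cd}^{*\beta}| + |\psi_{ab}^{\alpha}\psi_{ab}^{\beta}\psi_{cd}^{\alpha}\psi_{cd}^{\beta}\psi_{cd}^{*\alpha}\psi_{ab}^{*\beta}| + |\psi_{ab}^{\alpha}\psi_{ab}^{\beta}\psi_{cd}^{\alpha}\psi_{cd}^{\beta}\psi_{ab}^{*\alpha}\psi_{ab}^{*\beta}| + |\psi_{ab}^{\alpha}\psi_{ab}^{\beta}\psi_{cd}^{\alpha}\psi_{cd}^{\beta}\psi_{cd}^{*\alpha}\psi_{cd}^{*\beta}|) \tag{39}$$

$$= K_1(\Psi_{\mathrm{cov}} + \Psi_{\mathrm{ion}}) \tag{40}$$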
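in which the LMOs of eq.(41)

$$\psi_{ab} = a + kb, \quad \psi^{*}_{ab} = k^{*}a - b, \quad \psi_{cd} = d + kc, \quad \psi^{*}_{cd} = k^{*}d - c \tag{41}$$

are two-centre A-B and C-D bonding and antibonding MOs. The $\Psi_{\mathrm{cov}}$ of eq.(40) corresponds to the sum of the first two determinants of eq.(39), and is equivalent to the wavefunction for increased-valence structure XXXIII. The MO configuration of lowest energy that will interact with $\Phi_1(\mathrm{MO})$ is the $\Phi_2(\mathrm{MO})$ of eq.(42)

$$\Phi_2(\mathrm{MO}) = |\psi_1^{\alpha}\psi_1^{\beta}\psi_2^{\alpha}\psi_2^{\beta}\psi_4^{\alpha}\psi_4^{\beta}| \tag{42}$$

in which two electrons of $\Phi_1(\mathrm{MO})$ have been excited from $\psi_3$ into $\psi_4$. To simplify the presentation here, $\lambda$ is set equal to $k$. Unitary transformations of the $\psi_2$ and $\psi_4$ MOs of $\Phi_2(\mathrm{MO})$ then generate eq.(43).

$$\Phi_2(\mathrm{MO}) = |\psi_1^{\alpha}\psi_1^{\beta}\psi_2^{\alpha}\psi_2^{\beta}\psi_4^{\alpha}\psi_4^{\beta}|$$

$$= K_2'|(\psi_{ab} + \psi_{cd})^{\alpha}(\psi_{ab} + \psi_{cd})^{\beta}(\psi_{ab} - \psi_{cd})^{\alpha}(\psi_{ab} - \psi_{cd})^{\beta}(\psi^{*}_{ab} - \psi^{*}_{cd})^{\alpha}(\psi^{*}_{ab} - \psi^{*}_{cd})^{\beta}|$$

$$= K_2|\psi_{ab}^{\alpha}\psi_{ab}^{\beta}\psi_{cd}^{\alpha}\psi_{cd}^{\beta}(\psi^{*}_{ab} - \psi^{*}_{cd})^{\alpha}(\psi^{*}_{ab} - \psi^{*}_{cd})^{\beta}|$$

$$= K_2(\Psi_{\mathrm{ion}} - \Psi_{\mathrm{cov}}) \tag{43}$$

Configuration interaction (CI), via eq.(44), gives eq.(45),

$$\Psi(\mathrm{CI}) = C_1\Phi_1(\mathrm{MO}) + C_2\Phi_2(\mathrm{MO}) \tag{44}$$

$$= (C_1K_1 - C_2K_2)\Psi_{\mathrm{cov}} + (C_1K_1 + C_2K_2)\Psi_{\mathrm{ion}} \tag{45}$$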
and with $C_2 < 0$ when $C_1 > 0$ for the lower energy linear combination [55], the importance of $\Psi_{\mathrm{cov}}$ is increased relative to $\Psi_{\mathrm{ion}}$. Therefore the dominant contributor to the ($S = 0$ spin) MOCI wavefunction for a symmetrical six-electron four-centre bonding unit involves two three-electron bond configurations with their odd electrons spin-paired. Additional excited configurations that generate induction and dispersion interactions contribute to eq.(45) when $\lambda \neq k$. These are examined in ref. [55(a)], but they are not needed here. For an analysis of the MOCI wavefunction of a non-symmetrical six-electron four-centre bonding unit, see ref. [56].

13. THREE-ELECTRON BONDS AND INCREASED-VALENCE STRUCTURES FOR CYCLIC SIX-ELECTRON FOUR-CENTRE BONDING
In the increased-valence structures of Figure 9 for OClOClO$_2$, there are cyclic six-electron four-centre bonding units, and consideration will now be given to the three-electron bond theory for them when these bonding units have either D$_{4h}$ or D$_{2h}$ symmetry. Examples of these systems [57] are provided by (a) the $\pi$ electrons of S$_4^{2+}$; (b) (i) the $\pi$ electrons of S$_2$N$_2$ and (ii) six $\sigma$ electrons that are involved in the intermolecular bonding between the monomers of the OClOClO$_2$, I$_4^{2+}$, S$_6$N$_4^{2+}$ and [(C$_2$F$_5$)$_2$Se$_2$I$_2$]$^{2+}$ dimers. It is convenient to define the four-centre MOs according to eq.(46),

$$\begin{aligned}
\psi_1 &= a + \kappa b + c + \kappa d = \psi_{ab} + \psi_{cd} = \psi_{ad} + \psi_{bc}, \\
\psi_2 &= a - c, \qquad \psi_3 = b - d, \\
\psi_4 &= \kappa^{*}a - b + \kappa^{*}c - d = \psi^{*}_{ab} + \psi^{*}_{cd} = \psi^{*}_{ad} + \psi^{*}_{bc}
\end{aligned} \tag{46}$$

These MOs are canonical MOs for S$_2$N$_2$ and the central ClOClO moiety of OClOClO$_2$, when $\kappa$ is chosen variationally. For the six-electron four-centre components of the remaining species, $\kappa = \kappa^{*} = 1$ in the canonical MOs $\psi_1$ and $\psi_4$, and $\psi_2 + \psi_3$ and $\psi_2 - \psi_3$ replace $\psi_2$ and $\psi_3$ as canonical MOs.
[Increased-valence structures XXXVIII and XXXIX for the cyclic A, B, C, D bonding unit.]
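Because $\psi_4$ is antibonding with respect to each pair of adjacent atoms, the lowest-energy configuration is given by eq.(47)

$$\Phi_1(\mathrm{MO}) = |\psi_1^{\alpha}\psi_1^{\beta}\psi_2^{\alpha}\psi_2^{\beta}\psi_3^{\alpha}\psi_3^{\beta}| \equiv |(\psi_1)^2(\psi_2)^2(\psi_3)^2| \tag{47}$$

By a series of unitary transformations of the occupied MOs, $\Phi_1(\mathrm{MO})$ of eq.(47) may be transformed to give eq.(48)

$$\Phi_1(\mathrm{MO}) = K_1''(|(\psi_1)^2(\psi_2 + \kappa\psi_3)^2(\kappa^{*}\psi_2 - \psi_3)^2| + |(\psi_1)^2(\kappa^{*}\psi_2 + \psi_3)^2(\psi_2 - \kappa\psi_3)^2|)/2$$

$$= K_1'(|(\psi_{ab} + \psi_{cd})^2(\psi_{ab} - \psi_{cd})^2(\psi^{*}_{ab} - \psi^{*}_{cd})^2| + |(\psi_{ad} + \psi_{bc})^2(\psi_{ad} - \psi_{bc})^2(\psi^{*}_{ad} - \psi^{*}_{bc})^2|)/2$$

$$= K_1(|(\psi_{ab})^2(\psi_{cd})^2(\psi^{*}_{ab} - \psi^{*}_{cd})^2| + |(\psi_{ad})^2(\psi_{bc})^2(\psi^{*}_{ad} - \psi^{*}_{bc})^2|)/2$$

$$= K_1(-\Psi_{\mathrm{cov}} + \Psi_{\mathrm{ion}} - \Psi'_{\mathrm{cov}} + \Psi'_{\mathrm{ion}})/2 \tag{48}$$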
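in which the $\Psi_{\mathrm{cov}}$, $\Psi_{\mathrm{ion}}$, $\Psi'_{\mathrm{cov}}$ and $\Psi'_{\mathrm{ion}}$ are defined according to eq.(49).

$$\begin{aligned}
\Psi_{\mathrm{cov}} &= |\psi_{ab}^{\alpha}\psi_{ab}^{\beta}\psi_{cd}^{\alpha}\psi_{cd}^{\beta}\psi_{ab}^{*\alpha}\psi_{cd}^{*\beta}| + |\psi_{ab}^{\alpha}\psi_{ab}^{\beta}\psi_{cd}^{\alpha}\psi_{cd}^{\beta}\psi_{cd}^{*\alpha}\psi_{ab}^{*\beta}| \\
\Psi_{\mathrm{ion}} &= |\psi_{ab}^{\alpha}\psi_{ab}^{\beta}\psi_{cd}^{\alpha}\psi_{cd}^{\beta}\psi_{ab}^{*\alpha}\psi_{ab}^{*\beta}| + |\psi_{ab}^{\alpha}\psi_{ab}^{\beta}\psi_{cd}^{\alpha}\psi_{cd}^{\beta}\psi_{cd}^{*\alpha}\psi_{cd}^{*\beta}| \\
\Psi'_{\mathrm{cov}} &= |\psi_{ad}^{\alpha}\psi_{ad}^{\beta}\psi_{bc}^{\alpha}\psi_{bc}^{\beta}\psi_{ad}^{*\alpha}\psi_{bc}^{*\beta}| + |\psi_{ad}^{\alpha}\psi_{ad}^{\beta}\psi_{bc}^{\alpha}\psi_{bc}^{\beta}\psi_{bc}^{*\alpha}\psi_{ad}^{*\beta}| \\
\Psi'_{\mathrm{ion}} &= |\psi_{ad}^{\alpha}\psi_{ad}^{\beta}\psi_{bc}^{\alpha}\psi_{bc}^{\beta}\psi_{ad}^{*\alpha}\psi_{ad}^{*\beta}| + |\psi_{ad}^{\alpha}\psi_{ad}^{\beta}\psi_{bc}^{\alpha}\psi_{bc}^{\beta}\psi_{bc}^{*\alpha}\psi_{bc}^{*\beta}|
\end{aligned} \tag{49}$$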
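The $\Psi_{\mathrm{cov}}$ and $\Psi'_{\mathrm{cov}}$, each of which involves two three-electron bond configurations, are the wavefunctions for the increased-valence structures XXXVIII and XXXIX. Configuration interaction via eq.(50),

$$\Psi(\mathrm{MOCI}) = C_1\Phi_1(\mathrm{MO}) + C_2\Phi_2(\mathrm{MO}) + C_3\Phi_3(\mathrm{MO}) \tag{50}$$

with $\Phi_2(\mathrm{MO})$ and $\Phi_3(\mathrm{MO})$ defined according to eqs.(51) and (52)

$$\Phi_2(\mathrm{MO}) = |(\psi_1)^2(\psi_2 + \kappa\psi_3)^2(\psi_4)^2| = |(\psi_{ab} + \psi_{cd})^2(\psi_{ab} - \psi_{cd})^2(\psi^{*}_{ab} + \psi^{*}_{cd})^2| = K_2(\Psi_{\mathrm{cov}} + \Psi_{\mathrm{ion}}) \tag{51}$$

$$\Phi_3(\mathrm{MO}) = |(\psi_1)^2(\psi_2 - \kappa\psi_3)^2(\psi_4)^2| = |(\psi_{ad} + \psi_{bc})^2(\psi_{ad} - \psi_{bc})^2(\psi^{*}_{ad} + \psi^{*}_{bc})^2| = K_3(\Psi'_{\mathrm{cov}} + \Psi'_{\mathrm{ion}}) \tag{52}$$

The $\Psi(\mathrm{MOCI})$ of eq.(50) is then equivalent to eq.(53)

$$\Psi(\mathrm{MOCI}) = \{(-C_1K_1 + C_2K_2)\Psi_{\mathrm{cov}} + (C_1K_1 + C_2K_2)\Psi_{\mathrm{ion}} + (-C_1K_1 + C_3K_3)\Psi'_{\mathrm{cov}} + (C_1K_1 + C_3K_3)\Psi'_{\mathrm{ion}}\}/2 \tag{53}$$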
for which $C_2 < 0$ and $C_3 < 0$ when $C_1 > 0$. Thus according to eq.(53), the MOCI wavefunction of eq.(50) increases the importance of $\Psi_{\mathrm{cov}}$ and $\Psi'_{\mathrm{cov}}$ relative to $\Psi_{\mathrm{ion}}$ and $\Psi'_{\mathrm{ion}}$. As does eq.(45), eq.(53) indicates that spin-pairing of the unpaired electrons of two three-electron bond structures is the primary process that occurs when their orbitals overlap.
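The covalent-ionic decompositions of the cyclic MO configurations in eqs.(48), (51) and (52) can be reproduced by expanding every determinant over AO spin-orbitals. The sketch below is illustrative only: it assumes a real $\kappa$ with an arbitrary value, unnormalized MOs as written in eq.(46), and hypothetical helper names. It reports the proportionality constants that result, which for these unnormalized MOs are $4/(1 + \kappa^{2})^{2}$ for $\Phi_1$ relative to either $\Psi_{\mathrm{ion}} - \Psi_{\mathrm{cov}}$ or $\Psi'_{\mathrm{ion}} - \Psi'_{\mathrm{cov}}$, and 4 for $\Phi_2$ and $\Phi_3$.

```python
from fractions import Fraction
from itertools import product

def slater(columns):
    """Expand a determinant whose columns are {(ao, spin): coeff} dicts."""
    expansion = {}
    for picks in product(*(col.items() for col in columns)):
        labels = [label for label, _ in picks]
        if len(set(labels)) < len(labels):
            continue  # repeated spin-orbital: the determinant vanishes
        coeff = Fraction(1)
        for _, value in picks:
            coeff *= value
        inversions = sum(labels[i] > labels[j]
                         for i in range(len(labels))
                         for j in range(i + 1, len(labels)))
        key = tuple(sorted(labels))
        expansion[key] = expansion.get(key, 0) + (-1) ** inversions * coeff
    return {key: c for key, c in expansion.items() if c != 0}

def combine(*terms):
    """Sum of (factor, expansion) pairs, with zero coefficients dropped."""
    total = {}
    for factor, expansion in terms:
        for key, c in expansion.items():
            total[key] = total.get(key, 0) + factor * c
    return {key: c for key, c in total.items() if c != 0}

UP, DN = "alpha", "beta"
kappa = Fraction(1, 2)          # illustrative real value, kappa* = kappa

def so(mo, spin):               # attach a spin to a spatial MO {ao: coeff}
    return {(ao, spin): Fraction(c) for ao, c in mo.items()}

def closed(*mos):               # doubly occupied MOs, alpha then beta
    return [so(mo, s) for mo in mos for s in (UP, DN)]

def mix(p, m1, q, m2):          # spatial combination p*m1 + q*m2
    return {ao: p * m1.get(ao, 0) + q * m2.get(ao, 0) for ao in set(m1) | set(m2)}

def bond(x, z): return {x: 1, z: kappa}       # x + kappa z
def anti(x, z): return {x: kappa, z: -1}      # kappa* x - z

def cov_ion(b1, b2, a1, a2):    # eq.(49) for one pairing scheme
    core = closed(b1, b2)
    cov = combine((1, slater(core + [so(a1, UP), so(a2, DN)])),
                  (1, slater(core + [so(a2, UP), so(a1, DN)])))
    ion = combine((1, slater(core + closed(a1))), (1, slater(core + closed(a2))))
    return cov, ion

def constant(det, ref):         # det = K * ref, verified term by term
    key = next(iter(ref))
    ratio = Fraction(det.get(key, 0)) / Fraction(ref[key])
    assert det == combine((ratio, ref)), "expansions are not proportional"
    return ratio

# canonical MOs of eq.(46)
psi1 = {"a": 1, "b": kappa, "c": 1, "d": kappa}
psi2 = {"a": 1, "c": -1}
psi3 = {"b": 1, "d": -1}
psi4 = {"a": kappa, "b": -1, "c": kappa, "d": -1}

phi1 = slater(closed(psi1, psi2, psi3))                          # eq.(47)
phi2 = slater(closed(psi1, mix(1, psi2, kappa, psi3), psi4))     # eq.(51)
phi3 = slater(closed(psi1, mix(1, psi2, -kappa, psi3), psi4))    # eq.(52)

cov, ion = cov_ion(bond("a", "b"), bond("c", "d"), anti("a", "b"), anti("c", "d"))
covp, ionp = cov_ion(bond("a", "d"), bond("c", "b"), anti("a", "d"), anti("c", "b"))

K1 = constant(phi1, combine((1, ion), (-1, cov)))
K1p = constant(phi1, combine((1, ionp), (-1, covp)))
K2 = constant(phi2, combine((1, cov), (1, ion)))
K3 = constant(phi3, combine((1, covp), (1, ionp)))
print(K1, K1p, K2, K3)
```

The covalent terms enter $\Phi_1$ with a negative sign and $\Phi_2$, $\Phi_3$ with a positive sign, which is why negative $C_2$ and $C_3$ reinforce $\Psi_{\mathrm{cov}}$ and $\Psi'_{\mathrm{cov}}$ in the CI wavefunction.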
14. THREE-ELECTRON BONDS AND COVALENT-IONIC RESONANCE

There are two types of covalent-ionic resonance. These involve electron-pair bonds (A-B $\leftrightarrow$ A$^{-}$ B$^{+}$ $\leftrightarrow$ A$^{+}$ B$^{-}$) and three-electron bonds ($\dot{\mathrm{A}}\ \ddot{\mathrm{B}} \leftrightarrow \ddot{\mathrm{A}}^{-}\ \dot{\mathrm{B}}^{+} \equiv \dot{\mathrm{A}} \cdot \dot{\mathrm{B}}$) respectively [58]. The origins of the rotation barrier for N$_2$O$_4$ and the antiferromagnetism of Cu(II) carboxylate dimers provide well-studied examples of the latter [58-60]. For each case, a six-electron four-centre bonding unit is present [6,55]. The relevant AOs are displayed in Figure 11.

Figure 11. Relevant AOs for rotation barrier and antiferromagnetism studies of N$_2$O$_4$ and Cu(II) carboxylate dimers.

Figure 12. Relevant ONNO components of primary Lewis-type VB structures needed for a VB rationalization of the origin of the rotation barrier for the D$_{2h}$ isomer of N$_2$O$_4$.

It has been calculated [58,59] that resonance between the covalent (NO$_2$-NO$_2$) and the ionic (NO$_2^{+}$NO$_2^{-}$ and NO$_2^{-}$NO$_2^{+}$) structures of the types 1-4 of Figure 12 is primarily responsible for the planarity of the D$_{2h}$ isomer of N$_2$O$_4$. This type of resonance establishes stabilizing O-O three-electron bond interactions of the type ($\dot{\mathrm{O}}\ \ddot{\mathrm{O}} \leftrightarrow \ddot{\mathrm{O}}^{-}\ \dot{\mathrm{O}}^{+} \equiv \dot{\mathrm{O}} \cdot \dot{\mathrm{O}}$) in the planar conformer via a non-zero value of the O-O overlap integral for the cis oxygen AOs. Because the value of this overlap integral is zero in the perpendicular conformer, no cis O-O overlap stabilization of this conformer occurs.

Figure 13. Primary Lewis-type VB structures required for a VB rationalization of the origin of the antiferromagnetism of Cu$_2$(RCO$_2$)$_4$L$_n$.

For the Cu(II) carboxylate dimers, resonance between the covalent
(CuO-CuO) and ionic (CuO$^{+}$CuO$^{-}$ and CuO$^{-}$CuO$^{+}$) VB structures (Figure 13) of the types 1-4 for the $S = 0$ spin state, and 5-8 for the $S = 1$ spin state, both generate an O-O three-electron bond. However, whereas the covalent structures 1, 3, 5 and 7 are almost degenerate, the energies for the $S = 1$ spin ionic structures 6 and 8 lie well above those for the $S = 0$ spin ionic structures 2 and 4. As a consequence, the stabilization energy that arises from covalent-ionic resonance is less for the $S = 1$ spin state than it is for the $S = 0$ spin state, and therefore an antiferromagnetic alignment of electron spins occurs in the ground-state [60].

Another example of covalent-ionic resonance is provided by VB studies of N-N dimers of HNO [61]. It has been calculated that the barrier to rotation around an N-N bond involves a substantial contribution from three-electron bond covalent-ionic resonance of the type ($\ddot{\mathrm{N}}\ \dot{\mathrm{N}} \leftrightarrow \dot{\mathrm{N}}^{+}\ \ddot{\mathrm{N}}^{-}$), as well as from electron-pair bond covalent-ionic resonance (N-N $\leftrightarrow$ N$^{-}$ N$^{+}$ $\leftrightarrow$ N$^{+}$ N$^{-}$). The results of recent MO studies are considered to be in accord with this VB study [62]. For HCONH$_2$, the development of some N-C three-electron bonding ($\ddot{\mathrm{N}}\ \dot{\mathrm{C}} \leftrightarrow \dot{\mathrm{N}}^{+}\ \ddot{\mathrm{C}}^{-}$) in the planar conformer has been calculated to make a contribution to the rotation barrier of this molecule [63].

15. CONCLUSIONS

The concept of the three-electron bond was introduced in Pauling's 1931 papers [1]. Although Pauling did not have the three-electron bond in mind, he considered [64] that sometimes he was most proud of his first 1931 paper [1], which "changed the nature of chemistry in a significant way" [64]. As has been demonstrated here, the incorporation of three-electron bonds into mainstream VB theory provides a significant change in the manner in which qualitative VB descriptions of electron-rich molecules are formulated. The above review of a selection of aspects of three-electron bond theory, with particular attention given to increased-valence structures, provides a further demonstration of the aphorism that the "three-electron bond probes the ultimate limit of valence for electron-rich molecules" [65].

REFERENCES

1. L. Pauling, J. Am. Chem. Soc. 53 (1931) 1367, 3225.
2. L. Pauling, The Nature of the Chemical Bond, 3rd Edition, Cornell University Press (1960) Chapter 10.
3. (a) T. Kiang and R.N. Zare, J. Am. Chem. Soc. 102 (1980) 4024. (b) A.R. Gregory and V. Malatesta, J. Org. Chem. 45 (1980) 122. (c) P.M.W. Gill and L. Radom, J. Am. Chem. Soc. 110 (1988) 4931.
4. R.D. Harcourt, Qualitative Valence-Bond Descriptions of Electron-Rich Molecules: Pauling "3-Electron Bonds" and "Increased-Valence" Theory; Lecture Notes in Chemistry, Volume 30 (Springer, Berlin, 1982).
5. R.D. Harcourt, Chem. & Eng. News, 56, October 3, p.5 (1988).
6. R.D. Harcourt, in Valence Bond Theory and Chemical Structure (Eds. D.J. Klein and N. Trinajstić, Elsevier, Amsterdam, 1990) p. 251. See also ref. 7 below for some other accounts of this approach.
7. (a) A. van der Putten, A. Elzing, W. Visscher, E. Barendrecht and R.D. Harcourt, J. Mol. Struct. (Theochem) 180 (1988) 309. (b) S.J. Formosinho, in Theoretical and Computational Models for Organic Chemistry (Eds. S.J. Formosinho, I.G. Csizmadia and L.G. Arnaut, Kluwer, Dordrecht, 1991) p. 159. (c) T.M. Klapötke and A. Schulz, Quantenmechanische Methoden in der Hauptgruppenchemie, Spektrum 1996, p. 200.
8. (a) P.C. Hiberty, S. Humbel and P. Archirel, J. Phys. Chem. 98 (1994) 11697. (b) P.C. Hiberty, S. Humbel, D. Danovich and S. Shaik, J. Am. Chem. Soc. 117 (1995) 9003. For some other recent references, see also (c) D.K. Maity, H. Mohan, S. Chattopadhyay and J.P. Mittal, J. Phys. Chem. 99 (1995) 12195 and refs. 17 and 22 therein. (d) Y. Deng, A.J. Illies, M.A. James, M.L. McKee and M. Peschke, J. Am. Chem. Soc. 117 (1995) 420, and refs. 1-32 therein. (e) M.A. James, M.L. McKee and A.J. Illies, J. Am. Chem. Soc. 118 (1996) 7376.
9. L. Pauling, Chem. Revs. 5 (1928) 173.
10. K. Ruedenberg, Revs. Mod. Phys. 34 (1962) 326.
11. R.D. Harcourt, Am. J. Phys. 56 (1988) 660.
12. (a) M.J. Feinberg, K. Ruedenberg and E.L. Mehler, in Advances in Quantum Chemistry (Ed. P.-O. Löwdin, Academic Press, New York) 5 (1970) 27. (b) M.J. Feinberg and K. Ruedenberg, J. Chem. Phys. 54 (1971) 1495. (c) K. Ruedenberg, in Localization and Delocalization in Quantum Chemistry, Volume 1 (Eds. O. Chalvet et al., Reidel, Dordrecht, Holland, 1975) p. 223.
13. M.J. Feinberg and K. Ruedenberg, J. Chem. Phys. 55 (1971) 5804.
14. W.O. Kermak and W. Robinson, J. Chem. Soc. (1923) 432.
15. E.B.R. Prideaux, Chem. and Ind. 42 (1923) 672. See also R.D. Harcourt, J. Chem. Soc. Faraday Trans. 88 (1992) 1119 for VB alternatives to double and triple bonds that involve one-electron bonds.
16. J.W. Linnett, (a) J. Amer. Chem. Soc. 83 (1961) 2643. (b) The Electronic Structures of Molecules (Methuen, London, 1964). (c) Sci. Prog. (Oxford) 60 (1972) 1.
17. R.D. Harcourt and D. Jordan, Specul. Sci. Tech. 3 (1980) 77, 612.
18. (a) R.A. Firestone, Tetrahedron, 33 (1977) 3009. (b) R.F. Langer, J.E. Trenholm and J.S. Wasson, Canad. J. Chem. 58 (1980) 760. (c) W.B. Jensen, Canad. J. Chem. 59 (1981) 807. (d) G. Leroy, in Advances in Quantum Chemistry, 17 (1985) 1. (e) B.J. Duke, J. Mol. Struct. (Theochem) 152 (1987) 319. (f) R.D. Harcourt and R.D. Little, J. Am. Chem. Soc. 106 (1984) 41 and references therein.
19. See for example: (a) J.W. Linnett and coworkers as referenced in ref. 19(b). (b) R.D. Harcourt and A.G. Harcourt, J. Chem. Soc. Faraday Trans. II, 70 (1974) 743. (c) D.M. Hirst, J. Chem. Soc. Faraday Trans. II, 73 (1977) 422. (d) C. Amovilli, R.D. Harcourt and R. McWeeny, Chem. Phys. Letts. 187 (1991) 494.
20. J.A. Pople, Quart. Revs. 11 (1957) 273.
21. (a) J.W. Linnett, J. Chem. Soc. (1956) 275. (b) Canad. J. Chem. 36 (1958) 54. (c) M. Green and J.W. Linnett, J. Chem. Soc. (1960) 4959.
22. R.D. Harcourt, J. Chem. Ed. 62 (1985) 99.
23. R. Alder, Tetrahedron, 46 (1990) 682.
24. S.A. Chaudhri, H. Mohan, E. Anklam and K.-D. Asmus, J. Chem. Soc. Perkin 2 (1996) 383 and refs. 1-36 therein.
25. T. Clark, J. Am. Chem. Soc. 110 (1988) 1672.
26. (a) R.D. Harcourt and G.R. Scollary, Inorg. Nucl. Chem. Letts. 11 (1975) 821. (b) R.D. Harcourt, ref. 4, Chapter 5.
27. L. Pauling, ref. 2, Chapter 5.
28. S.J. Formosinho and L.G. Arnaut, J. Photochem. Photobiol. A: 82 (1994) 11.
29. K.J. Caulfield and R. Cooper, J. Am. Ceramics Soc. 78 (1995) 1054.
30. R.D. Harcourt, J. Mol. Struct. (Theochem) 229 (1991) 39.
31. L. Pauling, (a) ref. 2, p. 400. (b) J. Solid State Chem. 54 (1984) 297. (c) L. Pauling and Z.S. Herman, in Valence Bond Theory and Chemical Structure (Eds. D.J. Klein and N. Trinajstić, Elsevier, Amsterdam, 1990) p. 569.
32. R.D. Harcourt, J. Phys. B, 7 (1974) L41.
33. (a) N.C. Baird, J. Chem. Ed. 54 (1977) 291. (b) N.C. Baird, Pure Appl. Chem. 49 (1977) 223. Applications of three-electron bond theory to N-H bond dissociation energies are described in refs. (a) and (b). (c) R.D. Harcourt, Aust. J. Chem. 31 (1978) 199. (d) R.D. Harcourt, ref. 4, Chapter 3 and refs. therein.
34. J.R. Murrell and J. Ralston, J. Chem. Soc. Faraday Trans. II, 70 (1974) 2004.
35. C. Byrman and J.H. van Lenthe, Int. J. Quantum Chem. 58 (1996) 351.
36. R.D. Harcourt, J. Phys. Chem. A 101 (1997) 2496. Replace (a) $(1 + kk^{*})$ with $(1 + kk^{*})^{-1}$ in eq.(1), and (b) "used 7" with "used 18" in the 2nd column. Insert "H2-like" prior to "internuclear separations" in line 14 of the first paragraph on p. 2500.
37. R.D. Harcourt, (a) Biopolymers 11 (1972) 1551. (b) J. Mol. Struct. 12, 1 (1972) 1 (corrig. 13 (1973) 585).
38. R.D. Harcourt, Int. J. Quantum Chem. 60 (1996) 553.
39. R.D. Harcourt, (a) Theor. Chim. Acta 6 (1966) 131. (b) Int. J. Quantum Chem. 4 (1970) 173. (c) R.D. Harcourt and J.F. Sillitoe, Aust. J. Chem. 27 (1974) 691.
40. R.D. Harcourt and N. Hall, J. Mol. Struct. (Theochem) 342 (1995) 59 (corrig. 369 (1996) 217), and refs. 15, 31, 36-39, 61 and 66-70 therein.
41. R.M. Parrondo, P. Karafiloglou, R.R. Pappalardo and E. Sánchez, J. Phys. Chem. 99 (1995) 6461.
42. W.B. Floriano, S.R. Blaszkowski and M.A.C. Nascimento, J. Mol. Struct. (Theochem) 335 (1995) 51.
43. R.G.A.R. Maclagan, Aust. J. Chem. 41 (1988) 527.
44. P.C. Hiberty, in Valence Bond Theory and Chemical Structure (Eds. D.J. Klein and N. Trinajstić, Elsevier, 1990) p. 221.
45. M.S. de Giambiagi, M. Giambiagi and F.E. Jorge, Z. Naturforsch. A, 39 (1994) 1259 and refs. therein.
46. R.D. Harcourt, J. Mol. Struct. (Theochem) 300 (1993) 245.
47. R.S. Mulliken, J. Chim. Phys. 61 (1964) 20.
48. R.D. Harcourt, Aust. J. Chem. 28 (1975) 881.
49. C.A. Coulson and I. Fischer, Phil. Mag. 40 (1949) 396.
50. R.D. Harcourt, (a) J. Mol. Struct. (Theochem) 253 (1992) 363. (b) New J. Chem. 16 (1992) 667. (c) J. Mol. Struct. (Theochem) (1997) in press. See also ref. 30, and R.D. Harcourt and R. Ng, J. Phys. Chem. 97 (1993) 12210 (corrig. 98 (1994) 3226) for application to the exchange reaction X + RY $\rightarrow$ XR + Y.
51. (a) S. Shaik and A.D. Pross, Accts. Chem. Res. 16 (1983) 343. (b) S. Shaik and A.D. Reddy, J. Chem. Soc. Faraday Trans. 90 (1994) 1631. (c) S. Shaik and P.C. Hiberty, in Advances in Quantum Chemistry, 27 (1995) 99.
52. R.D. Harcourt, J. Chem. Soc. Faraday Trans. 87 (1991) 1089.
53. R.D. Harcourt, J. Phys. Chem. 97 (1993) 1351.
54. A. Rehr and M. Jansen, Inorg. Chem. 31 (1992) 4740.
55. R.D. Harcourt, (a) J. Am. Chem. Soc. 102 (1980) 5195 (corrig. 103 (1981) 5623). (b) J. Phys. Chem. 95 (1991) 6916. (c) Croat. Chem. Acta, 64 (1991) 399.
56. R.D. Harcourt, T.M. Klapötke and P.S. White, Inorg. Chim. Acta (1997) in press.
58. R.D. Harcourt, Chem. Phys. Letts. 218 (1994) 175 and refs. 18 and 20 therein.
59. R.D. Harcourt and F.L. Skrezenek, J. Phys. Chem. 96 (1990) 1351.
60. (a) R.D. Harcourt, F.L. Skrezenek and R.G.A.R. Maclagan, J. Am. Chem. Soc. 108 (1986) 5403. (b) R.D. Harcourt, ref. 4.
61. R.D. Harcourt, F.L. Skrezenek and B.G. Gowenlock, J. Mol. Struct. (Theochem) 284 (1993) 87. See ref. 58(b) for some corrections.
62. R. Glaser, R.K. Murmann and C.L. Barnes, J. Org. Chem. 61 (1996) 1047.
63. R.H. Flegg and R.D. Harcourt, J. Mol. Struct. (Theochem) 164 (1988) 87.
64. G.G. Kauffman and L.M. Kauffman, J. Chem. Ed. 73 (1996) 29.
65. F. Williams, Chem. Eng. News, 57 (1989) January 9, p.2.
66. M.L. McKee, J. Am. Chem. Soc. 117 (1995) 1629.
ADDENDUM: VB REPRESENTATION FOR 2NO + O2 → 2NO2
On the basis of an ab initio MO study of the potential energy surface for N2O4, McKee [66] has provided computational evidence for the existence of a new N2O4 isomer (ONOONO) with ¹A symmetry in the C2 point group. In Figure 14, a VB representation is provided for the formation and decomposition of this isomer via the mechanism NO + NO + O2 → ON...O2...NO (ts) → ONOONO → ONO....ONO (ts) → 2NO2 for the gas-phase oxidation of NO [66]. The formation of the ON...O2...NO transition state occurs primarily via the spin-pairing of the antibonding π*x and π*y electrons of ground-state O2 with the antibonding π* electron of each ground-state NO. The three-electron bond VB structures of Figure 4 are used to represent the reactants. The electronic reorganization leads to the formation of the increased-valence structures of type 5 of Figure 10 to represent the NO2 products.
Figure 14. VB representation for gas-phase oxidation of NO to NO2. Convenient 2-dimensional VB representations are used here. The bond-lengths implied by these VB structures are mainly in qualitative accord with those reported in ref. 66.
Z.B. Maksić and W.J. Orville-Thomas (Editors)
Pauling's Legacy: Modern Modelling of the Chemical Bond
Theoretical and Computational Chemistry, Vol. 6
© 1999 Elsevier Science B.V. All rights reserved.
Valence bond description of π-electron systems

J. Paldus^a,* and X. Li^a

^a Department of Applied Mathematics, University of Waterloo, Waterloo, Ontario N2L 3G1, Canada

We overview our valence bond (VB) approach to the π-electron Pariser-Parr-Pople (PPP) model Hamiltonians, referred to as the PPP-VB method. It is based on the concept of overlap enhanced atomic orbitals (OEAOs) that characterizes modern ab initio VB methods and employs the techniques afforded by the Clifford algebra unitary group approach (CAUGA) to carry out actual computations. We present a sample of previous results, as well as some new ones, to illustrate the ability of the PPP-VB method to provide a highly correlated description of the π-electron PPP model systems, while relying on conceptually very simple wave functions that involve only a few covalent structures.

1. INTRODUCTION

The pioneering paper by Heitler and London [1], providing the first genuine explanation of "the nature of the chemical bond" based on "new" quantum theory, as well as their subsequent work explaining basic valency rules [2] and the crucial role played by the symmetry of the wave function as implied by Pauli's exclusion principle [3], are rightfully considered as the birth of the new field of quantum chemistry. This work decidedly pointed the way to a proper physical understanding of chemical binding, unmistakably indicating that chemical bonds are fundamentally electric in nature and relegating the proponents of the "chemical force" to obscurity, and thus stimulated a fervent search for theoretical methods that would enable at least a qualitative handling of molecular systems with more than one or two electrons. Only a few months after the publication of Heitler and London's paper, these results were reported by Van Vleck at the 1928 St. Louis meeting of the American Chemical Society (cf. [4]).
Concluding his splendid exposé on the "New Quantum Mechanics", he asks: "Is it too optimistic to hazard the opinion that this is perhaps the beginnings of a science of 'mathematical chemistry' in which chemical heats of reaction are calculated by quantum mechanics just as are the spectroscopic frequencies of the physicist?", and confidently reckons: "Of course the mathematics will be laborious and involved, and the results always successive approximations. The theoretical computer of molecular energy levels must have a technique comparable with that of a mathematical astronomer. The quantum mechanics is still very young, and surely it will ultimately be applied further than the hydrogen molecule" [5]. This prophecy was amply fulfilled already a decade later, as witnessed by the very first textbook bearing the title of "Quantum Chemistry" [6], even though its full development had to await the modern computer era. One of the young scientists at the time who was very clearly aware of these developments and of their potential, and who became one of the key players by greatly contributing to the rapid expansion and success of this new field, was our 'hero of honor', Linus Pauling. In the same volume of Chemical Reviews where Van Vleck's beautiful review appears, we also find a superb review by Pauling [7] of the work on H2 and H2+ by Burrau [8], Heitler, London [1] and others, containing several original developments of his own. It was only a short step from here that led Pauling to apply these ideas to a multitude of molecular systems and to co-develop a general approach usually referred to today as the valence bond (VB) or Heitler-London-Pauling-Slater (H-L-P-S) method. Although the appropriateness of the label H-L-P-S was questioned [9], and the important contributions by other scientists, notably those by Kimball and Eyring [10], Rumer [11], Weyl [12], Wheland [13] and others are undeniable, Pauling's work certainly stands out for its thoroughness, generality and its systematic and exhaustive pursuit of the subject. In his classical series on "the nature of the chemical bond" that first appeared in the Journal of the American Chemical Society [14] and later in the newly started Journal of Chemical Physics [15], he quickly progressed from one- and three-electron bonds to the concept of resonance in conjugated systems, while at the same time producing (with E.B. Wilson) an excellent textbook on quantum mechanics [16] that introduced generations of chemists and physicists to the subject, and apparently holds the record for continuous publication without modification by its publisher [9].

*Also at: Department of Chemistry and Guelph-Waterloo Center for Graduate Work in Chemistry, Waterloo Campus, University of Waterloo, Waterloo, Ontario N2L 3G1, Canada
A very important ingredient in those developments was Pauling's concept of hybridization, originally referred to as "change in quantization" or "s-p quantization breaking through bond formation" [14,17]. Similar ideas, though in a more limited form [18], were later presented by Slater [19], who referred to hybrids as "concentrated bond functions". Almost simultaneously with the VB approach (or, in fact, what we would now call the AO based minimum basis set full configuration interaction method, as formulated by Slater [20]), the molecular orbital (MO) method was developed by Hund [21], Lennard-Jones [22], Herzberg [23], Mulliken [24], and others (see [25] for an early review). Initially, however, this theory merely provided labels for various electronic states of diatomics (via united and separated atom parentages) that were required in the analysis of their spectra, and was not regarded as suitable for quantitative or semi-quantitative calculations of binding energies, particularly when the MOs were represented as linear combinations of AOs (LCAO approximation) [25]. At the same time, the relationship and the relative advantages of VB and MO theories were becoming well understood, as were the possible avenues for their improvement (cf., e.g. [26]), but in view of the drastic simplifications that were made when evaluating the relevant integrals and the necessary computational limitations of the time, neither method was regarded as "any too good" [25]. At a later date, however, it was the MO approach that carried the day. The main reason for its success was not only its conceptual simplicity, but also the mathematical convenience of the orbital orthogonality that drastically simplified any quantitative computations.
Only relatively recently, and thanks to our better understanding of suitable AO-like basis sets [30], has the VB approach, with its appeal to a chemist's intuition and its direct link with Lewis' electron pairing ideas, started to enjoy a definite renaissance (see, e.g. [27-30]). Present day computing technology enables one to carry out very accurate calculations of various molecular properties, at least for relatively small systems. Nonetheless, one invariably runs into limitations when trying to extend the available methodology and software to larger and larger molecules. In this respect we have to keep in mind that already simple qualitative and semi-quantitative considerations, pioneered by Pauling and his contemporaries, generated a wealth of information and provided understanding of numerous chemical phenomena. This was particularly the case for organic molecules with conjugated double bonds, whose many interesting properties derive from their π-electronic structure. These developments culminated in the introduction of the Pariser-Parr-Pople (PPP) and related model Hamiltonians [31], which continue to play an important role to this day, particularly for large and extended systems, and which we consider next.

2. PPP-TYPE HAMILTONIANS
Despite the tremendous progress in quantum chemical methodology and especially in computer technology, alluded to in the Introduction, the complexities of molecular electronic structure pose a continual challenge to theoretical chemists. Even though we can presently perform very accurate and reliable calculations to obtain numerous properties of smaller molecules, the exigencies posed by larger systems involving heavier elements beyond the first two rows of the periodic table, especially the transition metals, as well as various organic or biologically important molecules, are likely to remain with us for some time. Some authors even questioned the appropriateness of current ab initio approaches for very large systems of biological significance on epistemological grounds [32]. It must be emphasized that in fact a very similar situation characterizes experimental investigations, since very different experimental techniques are employed, and different questions are asked, when investigating simple diatomics on the one hand and complex systems, such as chlorophyll, proteins, fullerenes, etc. on the other. Even if one could resolve and measure, for instance, all the individual ro-vibrational lines in the electronic spectra of large systems involving, say, more than 50 carbon atoms, the usefulness of such information is debatable. Likewise, our theoretical inquiries call for different models, approximations and techniques when handling different types of problems. Successful model building is at the very heart of modern science. It has been most successful in physics but, with the advent of quantum mechanics, great inroads have been made in the modelling of various chemical properties and phenomena as well, even though it may be difficult, if not impossible, to provide a precise definition of certain qualitative chemical concepts, often very useful ones, such as electronegativity, aromaticity and the like.
Nonetheless, all successful models are invariably based on the atomic hypothesis and quantum mechanics. The majority, be they of the ab initio or semiempirical type, are defined via an appropriate non-relativistic, Born-Oppenheimer electronic Hamiltonian on some finite-dimensional subspace of the pertinent Hilbert or Fock space. Consequently, they are most appropriately expressed in terms of the second quantization formalism, or even unitary group formalism (see, e.g. [33]). In the ab initio case, the relevant finite-dimensional one-electron space is spanned by a minimum, double-zeta, double-zeta plus polarization, etc., basis set and further approximations are then invoked at the N-electron level. In contrast, semiempirical Hamiltonians involve free parameters representing various matrix elements of certain parts of the ab initio Hamiltonian in an unspecified, hypothetical minimum basis set, whose values are determined either through calibration by experiment or by employing additional simplifying assumptions when their number is large. These models can thus be regarded as interpolation or extrapolation schemes based on the general form of the simplest possible quantum mechanical description. One of the most successful semiempirical Hamiltonians that also served as a prototype for many other models is undoubtedly the PPP model of planar π-electron systems with conjugated double bonds [31]. Originally designed as a generalization of the tight-binding Hückel Hamiltonian that would account more realistically for the Coulomb interelectronic repulsion, it eventually evolved into a very useful tool enabling the prediction and rationalization of molecular geometries, electronic, NMR, and ESR spectra, chemical reactivity and other properties of chemical interest. It also served for numerous theoretical developments, even though on a smaller scale than simpler models as represented by the Hubbard, Ising, Heisenberg, and various spin-Hamiltonians, whose simplicity often enables a thorough mathematical analysis without any recourse to numerical computations. These simple models can even provide an understanding of certain puzzling features of the more complex PPP model [34,35], and the similarity in the structure of corresponding wave functions can be very striking [36], even though they may be inadequate for quantitative predictions concerning the excited state manifold. Thus, in contrast to the Hubbard Hamiltonian, the PPP model can provide a realistic description of low lying excited states of planar conjugated systems [31,37].
Moreover, in view of the tight-binding character of its one-electron part, the PPP Hamiltonian is often invariant under topological transformations defining the so-called alternancy symmetry [38]. This approximate symmetry property is absent from any ab initio Hamiltonian, yet provides a good quantum number, and thus effective selection rules, that are particularly useful in multiphoton, circular dichroic and other spectroscopies [39-42]. The key premise of the PPP π-electron models, later heuristically extended to all-electron models (CNDO, MNDO, etc.), is that of the zero-differential overlap (ZDO) [43], which drastically reduces the number of required two-electron integrals, leaving only those of the Coulomb type $\gamma_{\mu\nu} = \langle \mu\nu | e^2/r_{12} | \mu\nu \rangle$. In contrast to the Hubbard model, which considers only on-site interactions, $\gamma_{\mu\nu} = \gamma_{\mu\mu}\delta_{\mu\nu}$, the PPP model accounts for the long-range nature of the Coulomb force. To avoid a multitude of adjustable parameters, one chooses a suitable approximation scheme for these integrals. Approximating the on-site $\gamma_{\mu\mu}$ integral via the so-called I-A approximation [44], following earlier developments by Goeppert-Mayer and Sklar [45], one evaluates the remaining Coulomb integrals $\gamma_{\mu\nu}$ using a suitably modified Coulomb law. Presently, the most often used approximation is that of Mataga and Nishimoto (MN) [46], modifying the simple point charge Coulomb interaction so that finite on-site interactions $\gamma_{\mu\mu} = I_\mu - A_\mu$, given by the difference of the valence state ionization potential ($I_\mu$) and electron affinity ($A_\mu$), are reproduced when both sites coalesce, i.e.

$$\gamma_{\mu\nu}(R_{\mu\nu}) = \frac{e^2}{a_{\mu\nu} + R_{\mu\nu}}, \qquad a_{\mu\nu} = \frac{2e^2}{\gamma_{\mu\mu} + \gamma_{\nu\nu}}, \tag{1}$$

with $R_{\mu\nu}$ designating the intersite separation. Thus, only the nearest neighbor one-electron resonance integrals $\beta_{\mu\nu}$ are left as free parameters. For hydrocarbons one usually employs the spectroscopic value $\beta_{\mu\nu} = \beta = -2.4$ eV. When the C-C separation significantly differs from the standard equilibrium bond length $d_0 = 1.4$ Å, as when exploring various distorted structures, we employ the so-called "Mulliken magic formula" for the resonance integrals $\beta(R)$ (cf. Eq. (6) of [47]). We can thus conveniently express the standard PPP Hamiltonian in a modified (particle number preserving) second quantized form

$$H_\pi = \sum_{\mu,\nu} z_{\mu\nu} E_{\mu\nu} + \frac{1}{2} \sum_{\mu,\nu} \gamma_{\mu\nu} E_{\mu\mu} (E_{\nu\nu} - \delta_{\mu\nu}), \tag{2}$$
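where $E_{\mu\nu}$ designate the orbital unitary group generators [33] (or the so-called replacement operators [48]) associated with the hypothetical $2p_z$ carbon AO basis $|\chi_\mu\rangle$. Furthermore,

$$z_{\mu\mu} = \alpha_\mu - \sum_{\nu(\neq\mu)} Z_\nu \gamma_{\mu\nu}; \qquad z_{\mu\nu} = \begin{cases} \beta_{\mu\nu} & \text{if } \mu, \nu \text{ are nearest neighbors} \\ 0 & \text{otherwise,} \end{cases} \tag{3}$$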
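where $\alpha_\mu$ is a so-called (one-electron) Coulomb integral, usually approximated by the corresponding valence state ionization potential $I_\mu$, and $Z_\mu$ is the number of π-electrons contributed by the $\mu$-th atomic site. For hydrocarbons ($Z_\mu = 1$, $\alpha_\mu = \alpha$) we can define the energy scale so that $\alpha_\mu = \alpha = 0$ and write

$$H_\pi = \beta {\sum_{\mu,\nu}}' E_{\mu\nu} + \frac{1}{2} \sum_{\mu,\nu} \gamma_{\mu\nu} (E_{\mu\mu} - 1)(E_{\nu\nu} - 1), \tag{4}$$

the first sum extending over nearest neighbors only [33]. We also recall that the second quantization realization of U(n) generators has the form

$$E_{\mu\nu} = \sum_{\sigma} X^\dagger_{\mu\sigma} X_{\nu\sigma}, \tag{5}$$

where the creation (annihilation) operator $X^\dagger_{\mu\sigma}$ ($X_{\mu\sigma}$) is associated with the spinorbital $|\chi_\mu\rangle|\sigma\rangle$.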
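The Mataga-Nishimoto prescription of Eq. (1) is simple enough to evaluate directly. The following is a minimal sketch, not the authors' code: it tabulates the Coulomb integrals for an idealized regular hexagon with 1.4 Å sides. The on-site value `GAMMA_CC` is an illustrative assumption (the text does not quote a number), and the function names are hypothetical.

```python
import math

# e^2/(4*pi*eps0) expressed in eV*Angstrom (standard conversion constant).
E2 = 14.3996

def mataga_nishimoto(r, gamma_mu, gamma_nu):
    """Two-center Coulomb integral of Eq. (1), in eV, for separation r in Angstrom."""
    a = 2.0 * E2 / (gamma_mu + gamma_nu)
    return E2 / (a + r)

def regular_polygon(n_sites, bond_length):
    """Planar coordinates of a regular polygon with the given side length."""
    radius = bond_length / (2.0 * math.sin(math.pi / n_sites))
    return [(radius * math.cos(2.0 * math.pi * k / n_sites),
             radius * math.sin(2.0 * math.pi * k / n_sites))
            for k in range(n_sites)]

def coulomb_matrix(coords, gamma_onsite):
    """Full gamma matrix for identical sites; the diagonal reproduces gamma_onsite."""
    n = len(coords)
    return [[mataga_nishimoto(math.dist(coords[i], coords[j]),
                              gamma_onsite, gamma_onsite)
             for j in range(n)] for i in range(n)]

# Assumed on-site integral I - A for carbon (illustrative value only).
GAMMA_CC = 10.84
benzene = regular_polygon(6, 1.4)
gamma = coulomb_matrix(benzene, GAMMA_CC)
```

At zero separation the formula returns the on-site value, and at large separation it approaches the point-charge result $e^2/R$, which are the two limits the interpolation is designed to reproduce.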
As already mentioned, this simple model is capable of describing, among other things, the main features of visible and near-UV electronic spectra for a multitude of π-electron systems. The usual implementation of this model relies on the MO formalism, obtaining first the self-consistent field (SCF) MOs, followed by a very limited singly-excited CI (SCI), at least when considering the low lying, electric dipole allowed transitions (the so-called α, p and β bands [37]). It was later realized, particularly when information concerning dipole forbidden transitions became available thanks to multiphoton or low-energy electron impact spectroscopies [39-42], that one has to go beyond the Tamm-Dancoff or SCI approximation if one is to obtain at least a correct ordering of the low-lying excited states. Particularly for the so-called alternant systems (whose structural formula is a bigraph, see e.g. [38]), whose states may be labelled by an additional quantum number (the so-called plus "+" or minus "-" states of Pariser [49]) providing additional selection rules, a qualitatively different behavior is found for states of "+" and "-" parity when introducing the correlation effects beyond the Tamm-Dancoff level of approximation: the singlet minus states undergo almost an order of magnitude larger lowering when the effect
of doubly excited configurations (via SDCI) is introduced as compared to the singlet plus states (and, similarly, though to a lesser degree, the triplet plus states are more affected than the triplet minus states) [50]. While this behavior may seem puzzling at the MO level, it is easy to understand from the VB viewpoint [50]. It has been well known since the early days of quantum chemistry [25] that the MO formalism overestimates the contribution of ionic structures at the expense of covalent ones: already for a simple homopolar diatomic, such as H2, the MO wave function consists of an equal mixture of the covalent and ionic structures, and the weight of the former is further diminished when going to polyatomic molecules. As a result, the role of doubly excited configurations is to reintroduce the correct balance by emphasizing the weight of covalent structures. Now, one can show [50] that in alternant systems the VB covalent structures are always of the minus type for singlets and of the plus type for triplets, while the ionic structures can be of either kind. Consequently, in view of the superfluity of ionic terms at the uncorrelated MO level, it will be the singlet minus states (and, correspondingly, the triplet plus states) that are most strongly influenced by the SDCI, since there are no covalent structures of the singlet plus or triplet minus type. The effect is smaller for triplet states, since they are reasonably correlated already at the SCI level. Thus, simple VB considerations enable us to understand the behavior of the MO model and to formulate simple qualitative rules describing the role of correlation effects. We shall see in the following that a simple VB picture can, in fact, provide very efficient and highly correlated results for the PPP model in general.

3. PPP-VB MODEL

As already indicated, the overemphasis of covalent structures in the VB wave function that is built from AOs, and of ionic structures in the MO wave function, as represented by a single antisymmetrized product of LCAO MOs, has been known since the inception of both approaches [25], as has the way, at least in principle, to remedy these shortcomings by proceeding towards the full CI (FCI) or full VB (FVB) limit, where both procedures coalesce, thereby yielding the exact solution in the finite dimensional subspace defined by the chosen AO set. In this latter procedure, the coefficients associated with distinct MO configurations or VB structures (both covalent and ionic) are varied independently, relying on the variation principle, and the transition from one formalism to the other is achieved using a similarity transformation between the two distinct basis sets spanning the same N-electron space. It is thus not difficult to realize that one could go a long way towards the FVB limit if, during the construction of VB wave functions for relevant covalent structures, one employed, instead of pure, strictly localized AOs, their linear combinations, not unlike the LCAO MOs, while preserving their AO character by allowing only a moderate "delocalization" to neighboring sites. This idea was first exploited by Coulson and Fischer [51] in their study of the hydrogen molecule, using the Heitler-London or VB-type trial wave function in which each AO contained an admixture of the AO on the other nucleus, the mixing parameter being treated as a variational parameter. This idea was later employed by Goddard and collaborators [27,52] in their generalized VB (GVB) method, by Cooper, Gerratt, Raimondi and others in their spin-coupled VB method [28,53], or its multiconfigurational generalizations [54,55] and other "modern VB" approaches [56-58], as well as by McWeeny [30], who also provided the most convincing and pedagogical illustration of this idea using the example of the H2 dimer [59]. The relevant LCAO-type AOs, whose admixture or mixing coefficients are optimized so that the VB wave function, as represented by the most important covalent structure(s), gives the lowest energy, are usually referred to as the overlap enhanced atomic orbitals (OEAOs). It is essential that these orbitals be nonorthogonal. As a consequence, the N! problem, typical of VB approaches, is still present. Nonetheless, finding the optimal OEAOs leads to highly correlated wave functions involving only a small number of covalent structures that possess all the essential bonding characteristics of the ground or low lying excited states. Quantitative chemical accuracy, if desired, may then be achieved with a moderate VB expansion involving a few hundred up to a few thousand additional structures constructed from the same OEAO set [28]. The key to the success of the spin-coupled VB is thus undoubtedly the exploitation of a flexible LCAO-type AO basis set, whose multicenter orbitals are suitably delocalized in the vicinity of each site. We have recently exploited this idea in the context of the PPP model with considerable success [60-65] (see also [66,67]). To preserve the simplicity of the model, we minimize the number of mixing parameters defining the OEAOs. Thus, starting from the hypothetical AOs $|\chi_\mu\rangle$ of the PPP model, we construct the OEAOs $|\phi_\mu\rangle$ by admixing to each AO $|\chi_\mu\rangle$ only the AOs $|\chi_\nu\rangle$ on the nearest neighboring sites. Moreover, unless the bond lengths involved are very different, we assume all mixing coefficients to be the same. Thus, in its unnormalized form we write
$$|\phi_\mu\rangle = |\chi_\mu\rangle + \varepsilon \sum_{\nu(\sim\mu)} |\chi_\nu\rangle, \tag{6}$$
where the symbol "∼" indicates that the sum extends only over the nearest neighbors of $\mu$. Such an OEAO basis set is referred to as the $\{b_1\}$ basis [60]. The unnormalized form has the advantage that the relative mixing parameter is not dependent on the number of nearest neighbors. The value of the mixing parameter $\varepsilon$ is then determined by minimizing the total energy. A higher approximation may be achieved by employing two-parameter OEAOs which result from the admixture of the next nearest neighbor AOs (with a second mixing parameter) as well as the nearest neighbor ones (mixing parameter $\varepsilon$). Such a basis set is referred to as the $\{b_{1,2}\}$ basis, and we can similarly define $\{b_{1,3}\}$, $\{b_{1,2,3}\}$, and other basis sets. However, it turns out that little is gained in this way [60,61], and that in most applications the $\{b_1\}$ basis is perfectly satisfactory. An important feature of a successful semiempirical approach is the transferability of parameters from one system to another. Although it may sometimes be advantageous to employ different parameters for different homologous classes or states of different multiplicity, etc., the ideal semiempirical parametrization should remain as universal as possible. We found that the $\{b_1\}$ basis is satisfactory in this regard, being transferable from one system to another regardless of the system's topological character (alternant vs. nonalternant systems), spin multiplicity (singlets, doublets, triplets, etc.), or electric charge (neutral species vs. ions). The optimal value of $\varepsilon$ ranges from ≈ 0.25 for cyclobutadiene
to ≈ 0.34 for benzene, the average value being ≈ 0.31 [60]. In fact, optimal $\varepsilon$ values for most systems range from 0.30 to 0.32. Moreover, the energy change that results when using the average value $\varepsilon$(ave) = 0.31 rather than the optimal one, $\varepsilon$(opt), for the mixing parameter never exceeds 0.13 eV (the value for cyclobutadiene) and on the average amounts to less than 0.01 eV. We must emphasize that in the PPP-VB calculations referred to above, we employed all the covalent structures for a given system. However, these results remain practically unchanged when only Kekulé structures are used. Only when a single Kekulé structure exists is it advisable to employ one or more additional non-Kekulé structures. Let us also mention that the actual computations are based on the Clifford algebra unitary group approach (CAUGA) [48,68] (cf. also [69]) and employ the Clifford algebra realization of the VB Rumer-Weyl basis [66,70]. We first construct the CAUGA VB states or structures using the just described OEAO basis set(s). The original version of our codes transforms these VB states into CAUGA states in terms of the orthonormal PPP AO basis, since the action of U(n) generators in this basis is very simple. A subsequent and more efficient algorithm, based on Eqs. (23)-(26) of [70], uses the fact that the action of the standard "orthogonal" U(n) generators $E_{\mu\nu}$, defining the PPP Hamiltonian (2) or (4), on the nonorthogonal OEAO basis, say $\{b_1\}$, simply produces the mixing coefficient in lieu of the Kronecker delta [60]. After determining the result of the action of $H_\pi$ on a given VB structure, we calculate the overlap with the other structures expressed in terms of bispinors or Slater determinants. We also exploit the fact that the overlap matrix is in block diagonal form relative to the distinct orbital occupancies, so that the similarity transformation to the orthogonal basis may be easily carried out block by block.
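To make the nonorthogonality of the one-parameter OEAOs of Eq. (6) concrete, the sketch below builds the expansion coefficients of the $\{b_1\}$ orbitals for a ring and evaluates their overlap matrix, assuming (as in the PPP model) that the underlying AOs are orthonormal. It is an illustration only, with hypothetical function names, and is unrelated to the CAUGA codes described above.

```python
def ring_neighbors(n_sites):
    """Nearest neighbors of each site in a ring of n_sites (n_sites >= 3)."""
    return {mu: ((mu - 1) % n_sites, (mu + 1) % n_sites)
            for mu in range(n_sites)}

def oeao_coefficients(neighbors, eps):
    """Rows are unnormalized OEAOs: coeff[mu][nu] multiplies the AO chi_nu in phi_mu."""
    n = len(neighbors)
    coeff = [[0.0] * n for _ in range(n)]
    for mu, nbrs in neighbors.items():
        coeff[mu][mu] = 1.0
        for nu in nbrs:
            coeff[mu][nu] = eps
    return coeff

def overlap_matrix(coeff):
    """Overlaps <phi_mu|phi_nu>, assuming the underlying AOs are orthonormal."""
    n = len(coeff)
    return [[sum(coeff[i][k] * coeff[j][k] for k in range(n))
             for j in range(n)] for i in range(n)]

EPS = 0.31  # average mixing parameter quoted in the text
S = overlap_matrix(oeao_coefficients(ring_neighbors(6), EPS))
```

For a six-membered ring this gives $\langle\phi_\mu|\phi_\mu\rangle = 1 + 2\varepsilon^2$, a nearest-neighbor overlap of $2\varepsilon$, a next-nearest-neighbor overlap of $\varepsilon^2$, and zero overlap between opposite sites, so even a modest $\varepsilon$ produces a substantial overlap between adjacent orbitals.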
The codes can employ any number of VB structures, both covalent and ionic, up to and including FVB. The latter is often helpful in evaluating the performance and reliability of the PPP-VB approach that is invariably based on a highly truncated N-electron VB basis, usually restricted to only Kekulé structures. Our codes are also capable of carrying out a population analysis of the resulting VB wave functions in terms of structural weights or bond orders.

4. APPLICATIONS

The primary objective of most applications carried out so far was to assess the performance of the PPP-VB method for diverse alternant and nonalternant π-electron systems of aromatic, nonaromatic or antiaromatic character, both electrically neutral and charged. The main emphasis was on ground states of different spin multiplicity, even though some preliminary calculations were also carried out for excited states. The PPP-VB codes were also employed to provide the approximate three- and four-body connected cluster components for the so-called VB-corrected coupled cluster (CC) approach [71]. In the following, we briefly point out the most important aspects of the PPP-VB method and illustrate them with a few typical results.

4.1. Correlated ground states, basis transferability and the role of ionic structures

To assess the effectiveness of the PPP-VB method in describing the correlated ground states of various π-electron systems, we have examined a set of 70 relatively small molecules with 4 to 8 π-electrons, for which we can easily generate the exact FCI or FVB wave functions and energies [60]. It is instructive to first consider the simplest possible examples of aromatic, antiaromatic, and nonaromatic systems represented by benzene, cyclobutadiene (CBD), and linear polyenes (say trans-butadiene and all-trans-hexatriene), respectively. Benzene, epitomizing aromatic systems, distinguishes itself by its high stability, symmetry and reactivity, while CBD exemplifies antiaromatic systems, is highly unstable and its most symmetric square configuration undergoes the Jahn-Teller distortion, resulting in the equilibrium geometry with highly pronounced bond length alternation (see, e.g. [73]). The nonaromatic polyenes then display a moderate bond length alternation which decreases with the increasing size of the system (see Sec. 4.3). To examine the equilibrium geometry, we also have to consider the σ-framework (see also Sec. 4.3). This is simply achieved at this level of approximation by assuming a strictly additive character for individual bonds, so that
$$\Delta E_\sigma = \sum_i \Delta E_{\sigma,i}(R_i), \tag{7}$$
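where the i-th bond relative σ-energy contribution $\Delta E_{\sigma,i}(R_i)$ as a function of the bond length $R_i$ is approximated by a suitable harmonic or 3rd order anharmonic oscillator

$$\Delta E_{\sigma,i}(R_i) = \kappa_2 (R_i - R_\sigma)^2 + \kappa_3 (R_i - R_\sigma)^3. \tag{8}$$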
The parameters $R_\sigma$, $\kappa_2$ and $\kappa_3$ are then obtained by fitting the experimental geometry and the totally symmetric stretching mode frequencies of benzene (cf. [47,61,65]). Thus, invoking the usual σ-π separability of the PPP model, the relative total energy is given by the sum

$$\Delta E^{(tot)} = \Delta E_\pi + \Delta E_\sigma. \tag{9}$$
No angular deformation potential is needed for our type of problems, in which only a small bond length alternation is assumed with the C-C-C angles preserved. Designating the longer and shorter bond lengths by $d_+$ and $d_-$, respectively, we assume that $d_+ + d_- = 2d_0$ (taking $d_0 = 1.444$ Å for CBD and 1.4 Å in all other cases). We can then characterize the distorted structures by a single parameter $\Delta$,

$$\Delta = \tfrac{1}{2}(d_+ - d_-). \tag{10}$$
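The constraint $d_+ + d_- = 2d_0$ together with Eq. (10) fixes both bond lengths as $d_\pm = d_0 \pm \Delta$. The sketch below, offered only as an illustration, generates such an alternating bond pattern and sums a polynomial σ-bond term over the bonds in the additive spirit of Eq. (7). The polynomial form without numerical prefactors, the function names, and the force constant used in the example are assumptions, not fitted values from the text.

```python
def alternating_bond_lengths(d0, delta, n_bonds):
    """Bond lengths alternating between d_+ = d0 + delta and d_- = d0 - delta."""
    d_plus, d_minus = d0 + delta, d0 - delta
    return [d_plus if i % 2 == 0 else d_minus for i in range(n_bonds)]

def sigma_energy(bond_lengths, r_sigma, kappa2, kappa3=0.0):
    """Additive sigma-framework energy: one polynomial term per bond.

    The quadratic-plus-cubic form and its parameters are illustrative
    assumptions; kappa3 = 0 gives the purely harmonic case.
    """
    return sum(kappa2 * (r - r_sigma) ** 2 + kappa3 * (r - r_sigma) ** 3
               for r in bond_lengths)

# Cyclobutadiene-like example: d0 = 1.444 Angstrom, distortion 0.05 Angstrom.
cbd_bonds = alternating_bond_lengths(1.444, 0.05, 4)
# Hypothetical force constant (eV per Angstrom squared), for illustration only.
cbd_sigma = sigma_energy(cbd_bonds, 1.444, 20.0)
```

With a purely harmonic term centered at $d_0$, the σ-energy penalty grows as $\Delta^2$ per bond, so any equilibrium distortion must come from a π-energy gain that outweighs it.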
Considering now the PPP-VB model with only two covalent structures (Kekulé structures for cyclic systems), we find that the resulting potential energy curves as a function of the distortion parameter $\Delta$ are almost identical with the exact FCI or FVB curves (see Fig. 4 of [61]). Only for benzene do we obtain a symmetric, equidistant geometry ($\Delta = 0$), all other systems displaying bond length alternation ($\Delta \neq 0$). The largest distortion and the smallest stabilization energy (relative to the symmetric, undistorted structure) are found for CBD ($\Delta \approx 0.05$-$0.06$ Å, $\Delta E^{(stab)} \approx 0.025$ eV), while for linear polyenes the distortion is smaller ($\Delta \approx 0.04$ Å) and the stabilization energy larger ($\Delta E^{(stab)}$/bond ≈ 0.05 eV). These are very reasonable values considering the simplicity of the model (cf. [61]).
We also examined [60] the amount of the correlation energy that is recovered by a simple PPP-VB method involving only one or two Kekulé structures, considering the whole range of the coupling constant (i.e., varying |β| from 0 to 5 or 10 eV). Of course, different coupling constants require different optimal mixing parameters ε (clearly ε → 0 as β → 0 since each covalent structure represents the exact solution in this fully correlated limit [33]). For CBD, the {b_1} basis already leads to very precise wave functions, and the error in the correlation energy never exceeds 0.1%. For benzene with the spectroscopic parametrization (β = -2.4 eV), we recover about 76% of the correlation energy using the {b_1} basis and over 90% with any 2-parameter basis. The results become progressively poorer as |β| increases (for β = -10 eV, we recover over 83% of the correlation energy using the {b_{1,3}} basis). Thus, as expected, MO and VB approaches complement one another, becoming exact in each other's limiting cases (MO in the uncorrelated |β| → ∞ limit and VB in the fully correlated β → 0 limit). Remarkably enough, the VB approach works reasonably well even in the weakly correlated limit (where MO is exact), and the mixing parameter ε changes very slowly, especially in the physical region of the coupling constant. This is a good indication of the transferability of the OEAO basis from one system to another, thus avoiding reoptimization in each case. Indeed, for a class of 20 typical systems we found that for the physical values of the coupling constant, all optimal values of ε fall within the interval 0.25-0.35, the most extreme boundary values corresponding to CBD and benzene. For most systems this range is much smaller, from 0.30 to 0.32, with an average value ε(ave) ≈ 0.31. Most importantly, when employing ε(ave) = 0.31, the maximum energy deviation relative to the optimized value is only 0.13 eV (for CBD).
Excluding CBD, the average energy difference for the 20 systems considered is only 0.01 eV, indicating that the average OEAO basis is transferable with a high degree of accuracy.

Consider, finally, the role of structures other than the Kekulé ones. For benzene, the energy lowering due to the Dewar structures is very small (usually less than 1% of the correlation energy). The ionic structures are most important when the simplest OEAO basis {b_1} is used, and significantly less so with two-parameter basis sets. The energy improvement is largest for the so-called "asymmetric" structures (see [60]), while the effect of symmetric ones is marginal (≈ 1%). Clearly, a very different situation is encountered for the excited states (see Sec. 4.4).

We can thus conclude that the PPP-VB method, employing a simple one- or two-parameter OEAO basis and relying on a few covalent (mostly Kekulé type) structures, provides an excellent approximation that accounts for a large part of correlation effects.

4.2. Spin properties

To explore the applicability of the PPP-VB method to high spin states we investigated conjugated π-electron systems of various topology and character [59]. We were interested in these states not only because of their importance in materials science (see, e.g. [73-75]), but also from the methodological viewpoint, since in contrast to the MO based approaches, which generally require a different methodology when investigating the low spin closed shells and high spin open shells, we can use the same PPP-VB method for all cases.
For alternant Kekulé hydrocarbons (i.e., those having at least one Kekulé structure) with an even number of sites, the transferability of the OEAO basis is once again large: the average value of ε for a set of 19 hydrocarbons that we examined was 0.303 for singlets and 0.312 for triplets. Thus, the earlier found (see Sec. 4.1) optimal value ε(ave) ≈ 0.31 can be safely used in both cases. As expected from either Longuet-Higgins' [76] or Ovchinnikov's [77] rule, all these systems have a singlet ground state. Moreover, the singlet-triplet splittings are well reproduced quantitatively already with the average {b_1} basis and a few important covalent structures. In most cases, the difference between the exact FVB and PPP-VB splittings amounts to less than 0.05 eV, the largest error occurring for benzene (0.28 eV) and ethylbenzene (0.12 eV). When the two-parameter {b_{1,3}} basis is used for benzene, the error reduces to 0.04 eV. Similarly, for alternant Kekulé π-systems with an odd number of sites, we find the average mixing parameter for doublets and quartets to be 0.309 and 0.311, respectively. Thus, using an average value of 0.31 leads to negligible changes in doublet-quartet separations (less than 0.01 eV). Longuet-Higgins' and Ovchinnikov's rules apply again (doublet ground state) and the exact FVB splittings are well reproduced by the PPP-VB method. Both singlet-triplet and doublet-quartet splittings for these systems are significant, exceeding 1 eV in most cases (see [64] for details). Very similar results are also found for non-Kekulé alternant hydrocarbons. In this case, however, it is the high spin state that has the lowest energy, and the singlet-triplet or doublet-quartet separations are much smaller, ranging between 0.05 and 1 eV. Again, a simple PPP-VB method, involving only the most important covalent structures and the average {b_1} basis (ε(ave) = 0.31), provides an excellent approximation in the 12 cases that we examined [64].
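The two counting rules invoked above can be sketched for alternant systems as follows. The sketch uses the standard statements of the rules, not anything specific to the PPP-VB code: Ovchinnikov's rule gives 2S = |n* - n°| from the numbers of starred and unstarred sites, while the Longuet-Higgins count gives N - 2T nonbonding orbitals, where T is the maximal number of double bonds in any structure, so that Hund's rule suggests 2S = N - 2T. The function names and the site numbering of the examples are our own illustrative choices.

```python
def two_colour(n_sites, bonds):
    """Star/unstar the sites of an alternant system; returns (labels, adjacency)."""
    adj = [[] for _ in range(n_sites)]
    for i, j in bonds:
        adj[i].append(j)
        adj[j].append(i)
    colour = [None] * n_sites
    for start in range(n_sites):
        if colour[start] is not None:
            continue
        colour[start] = 0
        stack = [start]
        while stack:
            u = stack.pop()
            for v in adj[u]:
                if colour[v] is None:
                    colour[v] = 1 - colour[u]
                    stack.append(v)
                elif colour[v] == colour[u]:
                    raise ValueError("system is nonalternant (odd cycle)")
    return colour, adj


def max_double_bonds(colour, adj):
    """Largest number of disjoint double bonds (maximum bipartite matching)."""
    match = {}  # unstarred site -> starred partner

    def try_augment(u, seen):
        for v in adj[u]:
            if v in seen:
                continue
            seen.add(v)
            if v not in match or try_augment(match[v], seen):
                match[v] = u
                return True
        return False

    starred = [i for i, c in enumerate(colour) if c == 0]
    return sum(try_augment(u, set()) for u in starred)


def predicted_spins(n_sites, bonds):
    """Return (2S from Ovchinnikov's rule, 2S from the Longuet-Higgins count)."""
    colour, adj = two_colour(n_sites, bonds)
    n_star = colour.count(0)
    ovchinnikov_2s = abs(n_star - (n_sites - n_star))
    longuet_higgins_2s = n_sites - 2 * max_double_bonds(colour, adj)
    return ovchinnikov_2s, longuet_higgins_2s


if __name__ == "__main__":
    benzene = [(i, (i + 1) % 6) for i in range(6)]
    trimethylenemethane = [(0, 1), (0, 2), (0, 3)]
    tetramethyleneethane = [(0, 1), (0, 2), (0, 3), (1, 4), (1, 5)]
    print(predicted_spins(6, benzene))
    print(predicted_spins(4, trimethylenemethane))
    print(predicted_spins(6, tetramethyleneethane))
```

For benzene both counts give a singlet and for trimethylenemethane both give a triplet, whereas for tetramethyleneethane the two counts disagree (singlet from Ovchinnikov's rule, triplet from the Longuet-Higgins count).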
The most important structures are always those involving the maximal number of double bonds (i.e., Kekulé-like). When truncating the set of these maximally covalent structures one must ensure that any two adjacent sites are π-bonded in at least one retained structure, lest the bonding be a priori biased by truncation. Another interesting test case is provided by alternant non-Kekulé hydrocarbons having the same number of starred and nonstarred atoms, as represented by a prototypical tetramethyleneethane (TME) molecule. Here Ovchinnikov's and Longuet-Higgins' rules give different predictions for the multiplicity of the ground state (i.e., singlet and triplet, respectively). The actual calculations show that all systems which we examined [64] have a singlet ground state, even though the singlet-triplet separation is very small (about 0.1 eV). Once again, the PPP-VB results are very close to the FVB ones. The proximity of the low and high spin states in these systems is easily understood from the VB viewpoint: the transition from singlet to triplet involves the breaking of a weak, long "bond" between nonadjacent sites in their leading structures. For nonalternant systems (involving cycles with an odd number of sites) Ovchinnikov's rule does not apply. By analogy with alternant systems we can expect, however, that the existence or nonexistence of at least one Kekulé structure will play an important role: Kekulé systems should have a low spin ground state. We examined about a dozen such systems involving 3, 5 or 7 membered rings. The average mixing parameters for the low and high spin states are 0.316 and 0.320, respectively. Again, using an overall average value of 0.31 has a minimal effect on the computed energies and the PPP-VB values are close to the FVB ones. The singlet-triplet (or doublet-quartet for systems with an odd
number of sites) separations are quite large for Kekulé systems (ranging from 0.8 to 3.6 eV) and the above indicated rule holds in all cases that we examined (a low spin ground state). On the other hand, non-Kekulé nonalternant systems have a high spin ground state and the singlet-triplet separations are much smaller (0.1 to 0.4 eV).

We can thus conclude that states of different spin multiplicity (singlets, doublets, triplets, quartets, etc.) of very diverse π-electron systems (Kekulé or non-Kekulé, alternant or nonalternant, aromatic, nonaromatic or antiaromatic) can be satisfactorily described by the PPP-VB method with a severely truncated set of covalent or maximally covalent structures using the same simple OEAO basis set {b_1}. In contrast, the MO description requires a different handling of closed and open shell cases and the amount of correlation recovered in states of different multiplicity may be rather unbalanced.

4.3. Electron delocalization, resonance and bond length alternation

The very concept of resonance originated in VB theory and was extensively exploited in qualitative explanations of stability and other properties of π-electron systems [78], surviving vitriolic attacks by some Soviet philosophers of the time. Recent re-examination of these ideas [61,65,79-82], based on new semiempirical and ab initio results, shows that the concept of resonance or π-electron delocalization is very subtle and prone to misinterpretation when improperly isolated. When we split the relative total energy of benzene (or, in fact, of any cyclic polyene C_NH_N with a nondegenerate ground state, N = 4ν + 2, ν = 1, 2, ...) into its σ and π components, Eq. (9), we observe that it is the σ-energy that stabilizes the symmetric, equidistant structure of a regular polygon, while the π-energy component favors the distorted, bond length alternating geometry (see, e.g. [41]).
This is easy to understand when we compare the π-electron component with, for example, a chain of N = 2n H atoms [83]: the optimal regular polygonal structure will preferably dissociate into n H2 molecules rather than into 2n H atoms, the energy gain being n times the dissociation energy of H2. Likewise, the π-electrons prefer a bond length alternating geometry, "crystallizing" to form a Wigner lattice. Now, for N = 6 or even N = 10, the σ-energy takes over and a regular polygonal equilibrium geometry results. For large cyclic polyenes (and, similarly, for an infinite linear polyenic chain) it is the π-component that will dominate, leading to bond length alternation. The magnitude of this alternation as measured by Δ, Eq. (10), increases with increasing N and stabilizes as N → ∞ at Δ ≈ 0.049 Å (similar results are obtained with MO based approaches as well [41,84,85]). This value agrees reasonably well with the experimentally observed bond length alternation in all-trans-polyacetylene films (Δ(exp) ≈ 0.052 Å) [86]. The corresponding stabilization energy ($\Delta E^{(\mathrm{stab})}/N$ ≈ 0.05 eV) is also reasonably close to the correlated ab initio calibrated result (0.08 eV) [87] (cf.
[85]). These facts, however, do not imply that the π-electron resonance energy (for different definitions see, e.g., [88]), associated with π-electron delocalization, should be regarded as "a byproduct of the σ-imposed geometric symmetry" (see [79] for this and similar statements). To isolate this effect we have to split $\Delta E_\pi$ into its "localized" π-electron component $\Delta E_\pi^{(\mathrm{LVB})}$, given by the sum of the π-electron energies of non-interacting ethylenic fragments, and its true "delocalization" or resonance energy $\Delta E_\pi^{(\mathrm{RES})}$, i.e.

$$\Delta E^{(\mathrm{tot})} = \Delta E_\sigma + \Delta E_\pi = \Delta E_\sigma + \Delta E_\pi^{(\mathrm{LVB})} + \Delta E_\pi^{(\mathrm{RES})} . \tag{11}$$
When we plot these various components as a function of the distortion parameter Δ, we find that each component $\Delta E_\pi^{(\mathrm{LVB})}$ and $\Delta E_\pi^{(\mathrm{RES})}$ changes much more appreciably with increasing Δ than does their sum $\Delta E_\pi$, or in fact $\Delta E_\sigma$. Of course, the least rapid change is found for the total energy $\Delta E^{(\mathrm{tot})}$ due to the opposing tendencies of $\Delta E_\sigma$ and $\Delta E_\pi$ (and of $\Delta E_\pi^{(\mathrm{RES})}$ and $\Delta E_\pi^{(\mathrm{LVB})}$, see Fig. 2 of [65]). This analysis shows the essential stabilizing role of the resonance energy, which must be sufficiently strong to overcome the dimerization tendency of $\Delta E_\pi^{(\mathrm{LVB})}$. Indeed, ($\Delta E_\sigma + \Delta E_\pi^{(\mathrm{LVB})}$) by itself would produce a bond length alternating structure even for benzene. In fact, for larger cyclic polyenes, the resonance energy is not sufficiently strong to overcome the dimerization tendency of ($\Delta E_\sigma + \Delta E_\pi^{(\mathrm{LVB})}$) and the bond length alternation sets in, as already pointed out.

4.4. Excited states
So far we have carried out only a very preliminary study of the excited states [62] for four typical π-electron systems: CBD, benzene, hexatriene and naphthalene. In all cases we found that the PPP-VB method can successfully describe the low lying excited states using only a small subset of VB structures. For states of a predominantly covalent character, the same OEAO basis as for the ground state can be used. However, for higher lying states with significant ionic character, a re-optimized OEAO basis is preferable. Thus, for example, using an optimized basis of {b_1} quality and only the most important singly ionic structures (in addition to covalent ones) for benzene, we find for the 1B2~, tB2o, tBx~ and 1Bl~ excitation energies values of 3.64, 5.93, 6.90 and 7.09 eV, which differ by less than 2% from the exact FCI values of 3.70, 5.96, 6.77 and 6.99 eV, respectively. Similarly, for all-trans-hexatriene the differences between the PPP-VB and FCI excitation energies are always smaller than 2% (for transitions to both singlet and triplet states). In the naphthalene case, we considered only covalent type transitions, and again no error in the excitation energies relative to FCI exceeded 4%. We thus believe that the PPP-VB approach represents a viable and useful alternative even for the excited states which, to this day, were invariably treated by MO methods. Particularly in photochemical processes, the insight afforded by the VB formalism may be very useful.

4.5. VB corrected coupled cluster method

The MO based single reference coupled cluster (CC) approach truncated at the pair-cluster level (CCSD) [89] is well known to provide a reliable description of correlation effects in nondegenerate ground states (see, e.g. [90]). The size-extensive character of this approach makes it a method of choice when considering small and medium size systems.
However, since the MO reference often becomes quasidegenerate when considering nonequilibrium geometries, the negligibility of the 3- and 4-body connected clusters, which is essential for a satisfactory performance of the CCSD method, no longer holds. Consequently, the CCSD description deteriorates in such cases or even completely breaks down [91,92]. For accurate calculations, 3-body clusters should be accounted for even when no quasidegeneracy is present (see, e.g. [90,93]). Unfortunately, an explicit account of these clusters (CCSDT method) is computationally very demanding. Since the electronic Hamiltonian involves at most two-body interactions, the energy is fully determined by 1- and 2-body cluster components. The latter can then be obtained by solving the CCSD equations. These equations, however, arise through decoupling of
the rest of the full CC chain by neglecting the 3- and 4-body connected clusters. Thus, were we to know these cluster components from some independent source, the CCSD equations corrected for the contributions of 3- and 4-body clusters would yield the exact energies. Of course, the exact values of these higher order clusters can only be found by cluster analysis of the exact (i.e., FCI) wave function. For practical applications, on the other hand, a reasonable estimate of these clusters is perfectly satisfactory. Such an estimate can be obtained from wave functions that account, at least approximately, for these clusters, such as the UHF [92,94,95], VB [67,71], CAS-SCF [96,97], etc. wave functions. This is precisely the idea behind our VB corrected CCSD method [67,71], since even very simple VB wave functions, involving only a few covalent structures, can often provide a good estimate of the desired higher order clusters in view of their complementary character to MO methods (and in spite of their inability to provide good 1- and 2-body cluster components). We have tested this idea using PPP model Hamiltonians, obtaining the 3- and 4-body clusters by cluster analyzing simple PPP-VB wave functions. We carefully explored a number of typical systems for the whole range of the coupling constant. Clearly, it is in the highly correlated limit (β → 0), where the standard CCSD fails but PPP-VB works best, that we obtain excellent results. Perhaps the most severe test of the proposed scheme was encountered in our study of the radicaloid mode of benzene dissociation or, equivalently, a recombination of two allylic radicals [67]. Since the separated limit involves open shell subsystems, standard CCSD approaches an incorrect channel. However, when VB corrected CCSD is used, we obtain practically the FCI result.

4.6. Ionization potentials and electron affinities

As a final illustration of the PPP-VB method we present some new results on the ionization potentials (IPs) and electron affinities (EAs) of cyclic polyenes C_NH_N with an odd number of sites N = 2n + 1, n = 1, 2, .... We compute these quantities as the difference between the energy of the appropriate ion and of the parent molecule. The IPs and EAs obtained with the PPP-VB method employing both the optimal and average {b_1} basis are compared with exact FCI results (wherever available) in Fig. 1 and Table 1. The degeneracy (indicated by the type of the pertinent symmetry species) of relevant states is also given in Table 1. We note that the symmetry of the lowest state, for either the neutral or charged polyenes, resulting from our simple VB approach involving only N covalent structures, is always the same as that obtained with the FCI method: the ground state of neutral polyenes is always degenerate (E-type symmetry species), while the ground states of cations and anions alternate between the nondegenerate A-type states and degenerate E-type states (see Table 1). All PPP-VB calculations reported here employ N covalent structures involving n double bonds and either an empty, singly occupied or doubly occupied remaining site. For greater clarity, the standard PPP-VB results using the average mixing parameter ε(ave) = 0.31 are connected via straight lines in Fig. 1. Again, we find that results obtained using optimized and average OEAOs are almost indistinguishable, and are very close to FCI results, as shown for N ≤ 9. Note that for N = 11 the dimensions of the relevant FCI problems are 60 984 for the anion and cation, and 104 544 for the neutral system.
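The FCI dimensions quoted for N = 11 can be checked with a short sketch. It assumes that they count spin-adapted configurations without any use of spatial symmetry, as given by the standard Weyl dimension formula for n orbitals, N electrons and total spin S, and that the relevant states are singlets for the ions and doublets for the neutral system. The function name `weyl_dimension` is ours.

```python
from math import comb


def weyl_dimension(n_orbitals, n_electrons, twice_s):
    """Number of spin-adapted states for given orbital count, electron count and 2S.

    Weyl's formula:
        D = (2S + 1) / (n + 1) * C(n + 1, N/2 - S) * C(n + 1, N/2 + S + 1)
    """
    if twice_s < 0 or twice_s > n_electrons or (n_electrons - twice_s) % 2:
        raise ValueError("incompatible electron number and spin")
    a = (n_electrons - twice_s) // 2       # N/2 - S
    b = (n_electrons + twice_s) // 2 + 1   # N/2 + S + 1
    numerator = (twice_s + 1) * comb(n_orbitals + 1, a) * comb(n_orbitals + 1, b)
    return numerator // (n_orbitals + 1)   # the division is always exact


if __name__ == "__main__":
    n = 11  # sites (orbitals) in the C11H11 ring
    print("cation :", weyl_dimension(n, 10, 0))
    print("neutral:", weyl_dimension(n, 11, 1))
    print("anion  :", weyl_dimension(n, 12, 0))
```

Under these assumptions the formula gives 60 984 for both ions and 104 544 for the neutral system, in agreement with the values quoted above.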
[Figure 1: plot of the IPs and EAs against the number of sites N (N = 3, 5, ..., 15). Legend: IP (FCI); IP (VB, ε(opt)); IP (VB, ε(ave)); EA (FCI); EA (VB, ε(opt)); EA (VB, ε(ave)).]
Figure 1. The IPs and EAs obtained with the PPP-VB method, employing both the optimal (opt) and average (ave) {b_1} OEAO basis and involving N VB structures, are compared with the exact FCI results.
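The alternation of the IPs and EAs can be illustrated with a minimal sketch that forms energy differences from the FCI entries of Table 1 for N ≤ 9. Two assumptions are made: the assignment of the tabulated blocks to the neutral, cationic and anionic species follows the degeneracy pattern described in the text, and the sign conventions are IP = E(cation) - E(neutral) and EA = E(neutral) - E(anion). Since the tabulated values are total π-electron energies, only the alternation pattern, not the absolute scale of Fig. 1, should be read from these differences.

```python
# FCI total pi-electron energies in eV, taken from Table 1:
# N -> (E(neutral), E(cation), E(anion)); the block assignment is an assumption.
FCI_ENERGIES = {
    3: (-4.28, -2.74, 1.77),
    5: (-8.51, -3.95, -6.64),
    7: (-12.41, -10.47, -8.61),
    9: (-16.20, -12.86, -14.26),
}


def ip_and_ea(neutral, cation, anion):
    """Energy differences under the assumed sign conventions."""
    ionization_potential = cation - neutral
    electron_affinity = neutral - anion
    return ionization_potential, electron_affinity


if __name__ == "__main__":
    for n_sites, energies in sorted(FCI_ENERGIES.items()):
        ip, ea = ip_and_ea(*energies)
        print(f"N = {n_sites}: IP = {ip:.2f} eV, EA = {ea:.2f} eV")
```

With these conventions the IP differences are 1.54, 4.56, 1.94 and 3.34 eV for N = 3, 5, 7 and 9, so the rings that reach a (4n + 2) electron count on ionization (N = 3, 7) ionize most easily, and the amplitude of the alternation shrinks with N.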
We observe a typical alternating behavior of the IPs and EAs for these systems. The magnitude of these "oscillations" decreases quite rapidly with an increasing number of sites N in the cycle. Clearly, this behavior reflects the stability of Hückel (4n + 2) systems having a nondegenerate ground state (note that these states always belong to the A-type symmetry species) and large resonance energies. Thus, while C3H3 will readily ionize to form C3H3+, the next member C5H5 prefers to accept an additional electron to form the anion C5H5- having the same π-electron sextet as benzene. Of course, as the size of the cycles increases, the stabilizing resonance energy becomes smaller and smaller (cf. Sec. 4.3), thus making the facility of the positive and negative ion formation less noticeable.

5. CONCLUSIONS

The above presented examples amply demonstrate that even the simplest version of the proposed PPP-VB scheme, employing a single parameter OEAO {b_1} basis, represents a viable, very efficient and transparent method that provides a highly correlated description of low lying states of π-electron systems. Moreover, the required {b_1} basis is readily
Table 1
A comparison of the exact FCI total π-electron energies E(FCI) (in eV) with those obtained by the PPP-VB method, employing either the optimal [E(VB, ε(opt))] or average [E(VB, ε(ave))] OEAO {b_1} basis set and the minimum number N of VB structures, for the ground states of cyclic polyenes C_NH_N with an odd number of sites N, N = 2n + 1, and of their ions. The degeneracy (Deg.) of these states is indicated by the type of the symmetry species involved (E for doubly degenerate and A for totally symmetric nondegenerate ones).

System      N    Deg.   E(FCI)    E(VB, ε(opt))   E(VB, ε(ave))
C_NH_N      3    E       -4.28     -4.21           -4.12
            5    E       -8.51     -8.36           -8.36
            7    E      -12.41    -11.99          -11.99
            9    E      -16.20    -15.72          -15.72
           11    E       n/a      -19.31          -19.31
           13    E       n/a      -22.90          -22.89
           15    E       n/a       n/a            -26.44
C_NH_N+     3    A       -2.74     -2.74           -2.73
            5    E       -3.95     -3.79           -3.79
            7    A      -10.47    -10.21          -10.21
            9    E      -12.86    -12.17          -12.17
           11    A       n/a      -17.10          -17.05
           13    E       n/a      -19.56          -19.54
           15    A       n/a       n/a            -23.77
C_NH_N-     3    E        1.77      1.77            1.78
            5    A       -6.64     -6.51           -6.47
            7    E       -8.61     -8.06           -8.05
            9    A      -14.26    -13.57          -13.56
           11    E       n/a      -15.84          -15.84
           13    A       n/a      -20.35          -20.30
           15    E       n/a       n/a            -23.02
transferable from one system to another without any appreciable loss of accuracy, so that one can safely employ the same universal value of the mixing parameter, ε(ave) = 0.31, for all π-electron systems. At the same time, the VB wave functions employed are very compact, involving only a few covalent structures, mostly of the Kekulé type. The key to the success of this approach in accounting for a large portion of the correlation effects lies in the use of OEAO basis sets. Indeed, it is well known that strictly localized AO basis sets yield VB wave functions (even for the ground states at the FVB level) in which the singly ionic ortho-polar structures predominate over the classically important covalent structures of the Kekulé type [98]. This is particularly the case for truncated VB expansions: when the expansion involves only covalent and singly ionic structures, the latter ones outweigh the former ones by almost an order of magnitude, and even some meta-polar ionic structures are more important than the covalent Kekulé structures [98]. When only covalent structures are employed, the resulting energy lies above the SCF energy [98]. Thus, the choice of an appropriately delocalized AO basis is crucial for the success and practical usefulness of the VB approach. At the same time it is essential that such a basis is nonorthogonal, since it is well known that the so-called VB approaches, employing quasilocal yet orthonormalized AO basis sets, lead to very slowly convergent VB expansions.
When designing computational algorithms it is thus important to suitably handle the well known N! problem arising from the orbital nonorthogonality (see, e.g. [99]). For the PPP Hamiltonian considered here, this problem can be partially avoided by employing the CAUGA formalism and by reverting to the original effective orthonormal PPP basis. Nonetheless, even more efficient algorithms could be developed by following some of the ideas that are exploited in current ab initio VB approaches. It must also be emphasized that our work focuses on ground states and low lying excited states that involve primarily covalent structures. This is in fact the case for most existing applications of the VB method, be they at the semiempirical or ab initio level. The higher lying excited states may be of ionic character, or may at least involve a large ionic component. Such states correlate with purely ionic structures in the fully correlated limit (cf. [100]). A proper description of such states will necessitate the consideration of both covalent and ionic structures, and possibly the reoptimization of the OEAO basis sets employed. We must recall here that in the MO description, we must consider doubly excited configurations in order to obtain at least a qualitatively correct ordering of these states. A thorough study of such states is thus desirable and their VB description, even at the semiempirical level, should be beneficial for a better understanding of the chemical nature of these states.

ACKNOWLEDGEMENTS

Continued support (J.P.) by NSERC is gratefully acknowledged.

REFERENCES
1. W. Heitler and F. London, Z. Phys., 44 (1927) 455.
2. F. London, Z. Phys., 46 (1928) 455; 50 (1928) 24.
3. W. Heitler, Z. Phys., 47 (1928) 835.
4. G.L. Clark, Chem. Rev., 5 (1928) 361.
5. J.H. Van Vleck, Chem. Rev., 5 (1928) 467.
6. H. Hellmann, Kwantowaja chimija, ONTI, Moscow, 1937, and Einführung in die Quantenchemie, Franz Deuticke, Leipzig, 1937; W. Kołos, in Perspectives in Quantum Chemistry, J. Jortner and B. Pullman (eds.), Kluwer, Dordrecht, 1989, pp. 145-159.
7. L. Pauling, Chem. Rev., 5 (1928) 173.
8. Ø. Burrau, Det. Kgl. Danske Videnskabernes Selskab. Math.-Fys. Meddelelser VII (1927) 14.
9. K. Gavroglu and A. Simões, in Historical Studies in the Physical and Biological Sciences, Vol. 25, Part 1 (University of California Press, Berkeley, 1994), pp. 47-110.
10. G.E. Kimball and H. Eyring, J. Amer. Chem. Soc., 54 (1932) 3876; H. Eyring and G.E. Kimball, J. Chem. Phys., 1 (1933) 239, 626.
11. G. Rumer, Nach. Ges. Wiss. Göttingen, M.P. Klasse (1932) 337.
12. H. Weyl, Nach. Ges. Wiss. Göttingen, M.P. Klasse (1930) 285; (1931) 33.
13. G.W. Wheland, J. Chem. Phys., 3 (1935) 230.
14. L. Pauling, J. Amer. Chem. Soc., 53 (1931) 1367, 3225; 54 (1932) 988, 3570.
15. L. Pauling, J. Chem. Phys., 1 (1933) 362, 606 and 679; see also L. Pauling, The Nature of the Chemical Bond, Cornell University Press, Ithaca, NY, 1948.
16. L. Pauling and E.B. Wilson, Introduction to Quantum Mechanics with Applications to Chemistry (McGraw-Hill, New York, 1935).
17. L. Pauling, Proc. Natl. Acad. Sci., 14 (1928) 359.
18. L. Pauling, Phys. Rev., 37 (1931) 1185.
19. J.C. Slater, Phys. Rev., 37 (1931) 481.
20. J.C. Slater, Phys. Rev., 38 (1931) 1109.
21. F. Hund, Z. Phys., 51 (1928) 759; 63 (1930) 719.
22. J.E. Lennard-Jones, Trans. Faraday Soc., 25 (1929) 668.
23. G. Herzberg, Z. Phys., 57 (1929) 601.
24. R.S. Mulliken, Phys. Rev., 32 (1928) 186, 761; 33 (1929) 730; J. Chem. Phys., 1 (1933) 492; 3 (1935) 375, and loc. cit.
25. J.H. Van Vleck and A. Sherman, Rev. Mod. Phys., 7 (1935) 167.
26. J.H. Van Vleck, J. Chem. Phys., 3 (1935) 803.
27. F.W. Bobrowicz and W.A. Goddard III, in Methods of Electronic Structure Theory, H.F. Schaefer III (ed.), Plenum, New York, 1977, pp. 79-127, and loc. cit.
28. J. Gerratt, Adv. At. Mol. Phys., 7 (1971) 141; J. Gerratt and M. Raimondi, Proc. Roy. Soc. London, A371 (1986) 525; D.L. Cooper, J. Gerratt, and M. Raimondi, Adv. Chem. Phys., 69 (1987) 319; idem, Int. Rev. Phys. Chem., 7 (1988) 59; idem, Top. Curr. Chem., 153 (1990) 41. See also contributions in this volume and loc. cit.
29. D.J. Klein and N. Trinajstić (eds.), Valence Bond Theory and Chemical Structure, Elsevier, Amsterdam, 1989.
30. R. McWeeny, in Ref. 29, pp. 13-51; idem, Methods in Molecular Quantum Mechanics, Academic, New York, 1989, Chap. 7. See also contribution in this volume and loc. cit.
31. R.G. Parr, The Quantum Theory of the Molecular Electronic Structure, Benjamin, New York, 1963.
32. H. Primas, Chemistry, Quantum Mechanics and Reductionism, 2nd ed., Springer, Berlin, 1983.
33. J. Paldus, in Theoretical Chemistry: Advances and Perspectives, Vol. 2, H. Eyring and D. Henderson (eds.), Academic, New York, 1976, pp. 131-290.
34. O.J. Heilmann and E.H. Lieb, Trans. N.Y. Acad. Sci., 33 (1971) 116.
35. A. Pellégatti, J. Čížek, and J. Paldus, Int. J. Quantum Chem., 21 (1982) 147; J. Čížek, R. Pauncz, and E.R. Vrscay, J. Chem. Phys., 78 (1983) 2468; J. Čížek, K. Hashimoto, J. Paldus, and M. Takahashi, Israel J. Chem., 31 (1991) 423; M.D. Gould, J. Paldus, and J. Čížek, Int. J. Quantum Chem., 50 (1994) 207.
36. J. Paldus and M.J. Boyle, Int. J. Quantum Chem., 22 (1982) 1281.
37. See, e.g., J. Koutecký, J. Paldus, and R. Zahradník, J. Chem. Phys., 36 (1962) 3129; J. Koutecký, J. Paldus, and J. Vítek, Coll. Czech. Chem. Commun., 28 (1963) 1468.
38. J. Koutecký, J. Paldus, and J. Čížek, J. Chem. Phys., 83 (1985) 1722 and loc. cit.
39. B. Dick, Zweiphotonenspektroskopie Dipol-Verbotener Übergänge, Ph.D. Thesis, Universität zu Köln, 1981.
40. G. Hohlneicher and B. Dick, J. Chem. Phys., 70 (1979) 5427; idem, Pure Appl. Chem., 55 (1983) 261.
41. L. Goodman and R.P. Rava, Acc. Chem. Res., 17 (1984) 250.
42. J. Michl, Tetrahedron, 40 (1984) 3845; M. Klessinger and J. Michl, Excited States and Photochemistry of Organic Molecules, VCH Publishers, New York, 1995 and loc. cit.
43. R. Pariser and R.G. Parr, J. Chem. Phys., 21 (1953) 466, 767; J.A. Pople, Trans. Faraday Soc., 49 (1953) 1375.
44. R. Pariser, J. Chem. Phys., 21 (1953) 568.
45. M. Goeppert-Mayer and A.L. Sklar, J. Chem. Phys., 6 (1938) 219.
46. N. Mataga and K. Nishimoto, Z. Phys. Chem., 13 (1957) 140.
47. J. Paldus and E. Chin, Int. J. Quantum Chem., 24 (1983) 373.
48. J. Paldus and B. Jeziorski, Theor. Chim. Acta, 73 (1988) 81.
49. R. Pariser, J. Chem. Phys., 24 (1956) 250.
50. J. Čížek, J. Paldus, and I. Hubač, Int. J. Quantum Chem., 8 (1974) 951.
51. C.A. Coulson and I. Fischer, Phil. Mag., 40 (1949) 306.
52. W.A. Goddard III, T.H. Dunning, Jr., W.J. Hunt, and P.J. Hay, Acc. Chem. Res., 6 (1973) 368; W.A. Goddard III and L.B. Harding, Ann. Rev. Phys. Chem., 29 (1978) 363 and loc. cit.
53. J. Gerratt, in Theoretical Chemistry, Vol. 4, Specialist Periodical Reports, Chemical Society, London, 1974.
54. N.C. Pyper and J. Gerratt, Proc. R. Soc. Lond., A 355 (1977) 407.
55. F.E. Penotti, Int. J. Quantum Chem., 46 (1993) 535; 59 (1996) 349.
56. J. Verbeek and J. van Lenthe, J. Mol. Struct. (Theochem), 229 (1991) 115.
57. P.B. Karadakov, J. Gerratt, D.L. Cooper, and M. Raimondi, J. Chem. Phys., 97 (1992) 7637; D.L. Cooper, J. Gerratt, M. Raimondi, M. Sironi, and T. Thorsteinsson, Theor. Chim. Acta, 85 (1993) 261.
58. R.B. Murphy and R.P. Messmer, J. Chem. Phys., 98 (1993) 7958.
59. R. McWeeny, Int. J. Quantum Chem., 34 (1988) 25.
60. X. Li and J. Paldus, J. Mol. Struct. (Theochem), 229 (1991) 249.
61. J. Paldus and X. Li, Israel J. Chem., 31 (1991) 351.
62. J. Paldus and X. Li, in Group Theory in Physics, AIP Conference Proceedings No. 266, A. Frank, T.H. Seligman, and K.B. Wolf (eds.), American Institute of Physics, New York, 1992, pp. 159-178.
63. J. Paldus and X. Li, in Symmetries in Science VI: From the Rotation Group to Quantum Algebras, B. Gruber (ed.), Plenum, New York, 1993, pp. 573-592.
64. X. Li and J. Paldus, Chem. Phys., 204 (1996) 447.
65. X. Li and J. Paldus, Int. J. Quantum Chem., 60 (1996) 513.
66. X. Li and J. Paldus, Int. J. Quantum Chem., 41 (1992) 117.
67. J. Planelles, J. Paldus, and X. Li, Theor. Chim. Acta, 89 (1994) 33, 59.
68. J. Paldus and C.R. Sarma, J. Chem. Phys., 83 (1985) 5135; J. Paldus, M.-J. Gao, and J.-Q. Chen, Phys. Rev. A, 35 (1987) 3197; C.R. Sarma and J. Paldus, J. Math. Phys., 26 (1985) 1140; M.D. Gould and J. Paldus, ibid. 28 (1987) 2304.
69. J. Paldus, in Mathematical Frontiers in Computational Chemical Physics, IMA Series, Vol. 15, D.G. Truhlar (ed.), Springer-Verlag, Berlin, 1988, pp. 262-299; idem, in Contemporary Mathematics, Vol. 160, N. Kamran and P.J. Olver (eds.), American Mathematical Society, Providence, RI, 1994, pp. 209-236.
70. J. Paldus, S. Rettrup, and C.R. Sarma, J. Mol. Struct. (Theochem), 199 (1989) 85.
71. J. Paldus and J. Planelles, Theor. Chim. Acta, 89 (1994) 13.
72. G. Maier, Angew. Chem., Int. Ed. Engl., 27 (1988) 309.
73. D. Döhnert and J. Koutecký, J. Am. Chem. Soc., 102 (1980) 1789.
74. W.T. Borden (ed.), Diradicals, Wiley, New York, 1982.
75. Proceedings of the Symposium on Ferromagnetic and High Spin Molecular Based Materials, Mol. Cryst. Liq. Cryst., 176 (1989).
76. H.C. Longuet-Higgins, J. Chem. Phys., 18 (1950) 265.
77. A.A. Ovchinnikov, Theor. Chim. Acta, 47 (1978) 297.
78. L. Pauling and G.W. Wheland, J. Chem. Phys., 1 (1933) 362; G.W. Wheland, Resonance in Organic Chemistry, J. Wiley & Sons, New York, 1955.
79. S. Shaik and R. Bar, Nouv. J. Chim., 8 (1984) 411; P.C. Hiberty, S.S. Shaik, J.-M. Lefour, and G. Ohanessian, J. Org. Chem., 50 (1985) 4657; S. Shaik, P.C. Hiberty, G. Ohanessian, J.-M. Lefour, Nouv. J. Chim., 9 (1985) 385; idem, J. Phys. Chem., 92 (1988) 4086; S.S. Shaik and P.C. Hiberty, J. Am. Chem. Soc., 107 (1985) 3089; S.S. Shaik and M.-H. Whangbo, Inorg. Chem., 25 (1986) 1201; S.S. Shaik, P.C. Hiberty, J.-M. Lefour, and G. Ohanessian, J. Am. Chem. Soc., 109 (1987) 363; P.C. Hiberty, in Topics in Current Chemistry, Vol. 153, I. Gutman and S.J. Cyvin (eds.), Springer-Verlag, 1990, p. 27 and loc. cit.
80. N.C. Baird, J. Org. Chem., 51 (1986) 3908; J.P. Malrieu, Nouv. J. Chim., 10 (1986) 61.
81. E.D. Glendening, R. Faust, A. Streitwieser, K.P.C. Vollhardt, and F. Weinhold, J. Amer. Chem. Soc., 115 (1993) 10952.
82. P.C. Hiberty, G. Ohanessian, S.S. Shaik, and J.P. Flament, Pure Appl. Chem., 65 (1993) 35.
83. M. Bénard and J. Paldus, J. Chem. Phys., 72 (1980) 6546.
84. M. Takahashi and J. Paldus, Int. J. Quantum Chem., 28 (1985) 459 and loc. cit.
85. H. Guo and J. Paldus, Int. J. Quantum Chem., in press.
86. C.F. Fincher, C.E. Chen, A.J. Heeger, A.G. MacDiarmid, and J.B. Hastings, Phys. Rev. Lett., 48 (1982) 100.
87. G. König and G. Stollhoff, Phys. Rev. Lett., 65 (1990) 1239.
88. S. Nikolić, M. Randić, D.J. Klein, D. Plavšić, and N. Trinajstić, J. Mol. Struct., 198 (1989) 223; S.A. Alexander and T.G. Schmalz, J. Am. Chem. Soc., 109 (1987) 6933.
89. J. Čížek, J. Chem. Phys., 45 (1966) 4256; idem, Adv. Chem. Phys., 14 (1968) 35; J. Čížek and J. Paldus, Int. J. Quantum Chem., 5 (1971) 359.
90. R.J. Bartlett, J. Phys. Chem., 93 (1989) 1697; J. Paldus, in Methods in Computational Molecular Physics, NATO ASI Series, Series B, Vol. 293, S. Wilson and G.H.F. Diercksen (eds.), Plenum, New York, pp. 99-194; idem, in Relativistic and Correlation Effects in Molecules and Solids, NATO ASI Series, Series B, Vol. 318, G.L. Malli (ed.), Plenum, New York, 1994, pp. 207-282, and loc. cit.
91. K. Jankowski and J. Paldus, Int. J. Quantum Chem., 18 (1980) 1243; P. Piecuch, S. Zarrabian, J. Paldus, and J. Čížek, ibid., B 42 (1990) 3351.
92. J. Paldus, M. Takahashi, and R.W.H. Cho, Phys. Rev., B 30 (1984) 4267.
93. H. Zhi, D. Cremer, Int. J. Quantum Chem., Symp. 25 (1991) 43; idem, Theor. Chim. Acta, 85 (1993) 305.
94. J. Paldus, J. Čížek, and M. Takahashi, Phys. Rev., A 30 (1984) 2193.
95. P. Piecuch, R. Toboła, and J. Paldus, Phys. Rev., A 54 (1996) 1210, and loc. cit.
96. L. Stolarczyk, Chem. Phys. Lett., 217 (1994) 1.
97. G. Peris, J. Planelles, and J. Paldus, Int. J. Quantum Chem., in press.
98. G.A. Gallup, Int. J. Quantum Chem., 6 (1972) 899; J.M. Norbeck and G.A. Gallup, J. Am. Chem. Soc., 100 (1973) 4460; 96 (1974) 3386; G.A. Gallup and J.M. Norbeck, ibid., 97 (1975) 970.
99. J. Verbeek and J.H. van Lenthe, J. Mol. Struct. (Theochem), 229 (1991) 115.
100. J. Paldus, J. Čížek, and I. Hubač, Int. J. Quantum Chem., Symp. 8 (1974) 293.
Z.B. Maksić and W.J. Orville-Thomas (Editors), Pauling's Legacy: Modern Modelling of the Chemical Bond, Theoretical and Computational Chemistry, Vol. 6. © 1999 Elsevier Science B.V. All rights reserved.
The spin-coupled description of aromatic, antiaromatic and nonaromatic systems

David L. Cooper (a), Joseph Gerratt (b) and Mario Raimondi (c)

(a) Department of Chemistry, University of Liverpool, P.O. Box 147, Liverpool L69 3BX, United Kingdom
(b) School of Chemistry, University of Bristol, Cantocks Close, Bristol BS8 1TS, United Kingdom
(c) Dipartimento di Chimica Fisica ed Elettrochimica, Università di Milano, Via Golgi 19, 20133 Milano, Italy

The correlated motion of π electrons in a range of aromatic, antiaromatic and nonaromatic systems is investigated using modern spin-coupled theory. In the particular case of benzene, the resulting wavefunction, which is superior to the simple MO description, resembles closely the classical VB picture, except for small (but crucial) deformations of the orbitals. It is argued, by reference to a multiconfiguration approach which subsumes both of the simple MO and VB models, that we should feel comfortable with switching between the two modes of description, according to which leads most directly to correct predictions for the particular problem at hand. The bonding in benzene is contrasted with that in cyclobutadiene and cyclooctatetraene, at various geometries. In the spin-coupled model, the distinguishing feature of antiaromatic species is the occurrence of triplet coupling of electron pairs. Of course, the singlet ground states of both of these molecules distort from the idealized high-symmetry geometries to nonaromatic situations with olefinic bonds.

1. INTRODUCTION
Organic chemists tend to be pragmatists when faced with rival MO and VB descriptions of molecular electronic structure. Many will use whichever model seems most convenient for the problem at hand. MO descriptions are widely employed in frontier orbital approaches, as in the Woodward-Hoffmann rules, and tend to be favoured when predicting excited states or photoelectron spectra. On the other hand, it is customary to represent reaction mechanisms in terms of resonance between classical VB structures with single, double etc. bonds (plus any unpaired electrons or lone pairs) and then to indicate by means of 'curly
arrows' the supposed electron reorganization. The attitude fostered by many textbooks is that MO theory is in some sense 'more fundamental' than the VB model, and some organic chemists admit to a feeling of unease when using VB-based arguments in discussions with quantum chemists.

A somewhat ambivalent view of the relative merits of MO and VB approaches is certainly apparent for aromatic molecules. It is common to use interchangeably terms such as 'delocalization energy' and 'resonance energy'. Students are taught about the π electrons of benzene moving more-or-less independently of one another in delocalized orbitals, and the relative stabilities of 'aromatic' and 'antiaromatic' systems are typically discussed in terms of the simple Hückel 4n+2 rule. On the other hand, the VB model proves to be exceptionally useful when predicting the outcome of electrophilic substitution reactions, for example. Considerations of resonance between Kekulé (and other) structures and ideas as to the relative energies of Wheland intermediates typically lead to very straightforward predictions of correct products.

The purpose of the present account is to discuss some of the key findings of modern valence bond theory, in its spin-coupled form, for π-electron ring systems. The approach incorporates from the outset the chemically most significant effects of electron correlation, but it retains a simple, clear-cut visuality. Spin-coupled calculations have been performed for a wide range of π-electron aromatic, antiaromatic and nonaromatic ring systems.
Molecules studied include benzene [1-10], cyclobutadiene [11], cyclooctatetraene [12], six-membered ring heterocycles [3,13] and five-membered ring heterocycles [14-16], o-benzyne [17], fused ring systems [18,19,20], various inorganic molecules [4,21,22], rings with methylene substituents [11,18,23], oligomers [16,18,23], and certain unusual species, such as bicyclic 1,6-methano[10]annulene, which features a nonplanar aromatic system [24]. In addition to studying ground states, we have carried out calculations on excited states, on ionization potentials, and on electrophilic substitution reactions. Reviews of our earliest work on benzenoid aromatic molecules and on antiaromatic systems are available elsewhere [25,26].

For each system, we take account of electron correlation for the π electrons, but not for the σ framework. However, the π orbitals for aromatic systems lie to a considerable extent within the space of the σ orbitals and not well outside it, as is commonly assumed. This might seem to bring into question the fundamental concept of σ-π separation. However, the π orbitals are a great deal more polarizable than the σ orbitals, and so it can be argued that the π system provides a large proportion of the response of the system to chemical and other influences. As a result, much of the chemistry of aromatic systems can be understood by considering only the π electrons.

An important outcome of all these spin-coupled calculations is the consistency of the descriptions. In particular, a simple and highly visual model emerges for the behaviour of correlated π electrons in all of the aromatic molecules that we have studied. These π-electron systems are well described in terms of fairly localized, nonorthogonal, singly-occupied orbitals. The special stability of such systems arises in the spin-coupled model from a profoundly quantum mechanical
phenomenon, namely the mode of coupling of the electron spins, as is shown by the magnitudes of the resonance energies, and not from any supposed delocalization of the orbitals. The descriptions of these systems are, to all intents and purposes, unaltered by the inclusion of ionic structures or of additional electron correlation into the wavefunction.

It is important to dispense with the received wisdom that MO theory is in some sense more fundamental than VB approaches. On the other hand, it is certainly not our intention to argue that the MO description is somehow 'wrong'. In the particular case of benzene, we quantify to what extent the conventional MO and VB models can be considered reliable approximate representations of a particular type of multiconfigurational wavefunction that is more sophisticated than those obtained from either approach. We conclude that we should not have any serious qualms about switching between the MO and VB representations, according to the nature of the particular problem being addressed.

2. SPIN-COUPLED WAVEFUNCTIONS
The ab initio spin-coupled wavefunction for an N-electron system is based on a single product of singly-occupied orbitals, $\phi_1\phi_2\cdots\phi_N$. In general, these orbitals are all distinct and nonorthogonal. The spin-coupled orbitals are expanded in an atomic basis set, much as in MO theory. This orbital product is combined with an N-electron spin function $\Theta^N_{SM}$ corresponding to total spin S and projection M, and the configuration is antisymmetrized. The total spin function is expanded in the full spin space and is fully optimized variationally, simultaneously with the spin-coupled orbitals. This is usually done without the imposition of any constraints that would alter the total wavefunction. The spin-coupled wavefunction incorporates from the outset the chemically most significant effects of (nondynamical) electron correlation and describes properly all dissociation pathways.

Just as in correlated descriptions based on MO theory, it is neither practical nor desirable to include electron correlation for all of the electrons in large systems. In common with other strategies, the orbital space is partitioned into 'inactive', 'active' and unoccupied or 'virtual' subspaces. Electron correlation is incorporated only for the 'active' space, which corresponds to that part of the electronic structure which interests us most. A convenient representation of a general spin-coupled wavefunction takes the form:
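$$\Psi_{SC} = \mathcal{A}\left\{ \left( \psi_1^2 \psi_2^2 \cdots \psi_n^2 \, \Theta^{2n}_{pp} \right) \phi_1 \phi_2 \cdots \phi_N \, \Theta^{N}_{SM} \right\} \qquad (1)$$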
in which $\mathcal{A}$ is the antisymmetrizer. The spin function $\Theta^{2n}_{pp}$ represents perfect pairing of the spins of the 2n inactive electrons, which are accommodated in the n doubly-occupied orbitals $\psi_i$. Conditions that do not alter the total wavefunction are normalization of the $\psi_i$ and $\phi_\mu$, orthogonalization of the $\psi_i$ amongst themselves, and orthogonalization of the $\psi_i$ to the $\phi_\mu$. We may reorder the spin-coupled active orbitals, as long as we also make the corresponding changes to the total spin function. On the other hand, the wavefunction is not invariant to linear transformation of the $\phi_\mu$, so that the form of the spin-coupled solution represents a unique outcome of the variational procedure for the given choice of active space.

The spin-coupled model represents the proper generalization to many-electron systems of the early VB work of Heitler and London [27], and of Coulson and Fischer [28]. A hallmark of the spin-coupled approach is the direct optimization of nonorthogonal orbitals without preconceptions as to their form or degree of localization, and without constraints on the overlaps between them or on the associated mode of coupling of the electron spins. A key feature is the expansion of the total spin function for the active electrons as a linear combination of all allowed modes of spin coupling.

There are in fact many different, useful ways of constructing complete sets of linearly independent N-electron spin functions, as is described, for example, in the book by Pauncz [29]. Spin bases that prove especially convenient for carrying out spin-coupled calculations include the Kotani and Rumer schemes, although we have also recently explored the increased efficiencies that may be achieved with character-projected spin functions. The Serber basis also proves to be very useful as an interpretational tool. It is important to stress that, provided we use the full spin space, we may transform easily between these various representations [30]. We may also choose to reorder the orbitals, if this aids interpretation. In essence, the Kotani scheme corresponds to the successive coupling of the spins of individual electrons, according to the usual rules for combining angular momenta.
The Rumer scheme, which was widely used in classical VB treatments, corresponds instead to coupling (in arbitrary order) singlet-coupled pairs of electrons and any unpaired spins. The Serber scheme involves the successive coupling of singlet- or triplet-coupled pairs of electrons. The Rumer basis turns out to be particularly useful for interpreting the total spin functions for aromatic systems. In the case of N=6 and S=0, there are just five linearly independent modes of spin coupling [29], which may be represented as in Figure 1, in which an arrow i→j signifies a factor in the total spin function of $2^{-1/2}\,(\alpha(i)\beta(j) - \alpha(j)\beta(i))$. The similarity to Kekulé and para-bonded structures for benzene is obvious.

The spin-coupled approach, which combines an accurate description of molecular electronic structure with a highly visual picture of the bonding, has now been applied to a very wide range of problems and numerous review articles are available [25,26,31-34], including details of the various computational algorithms. A subsequent chapter is concerned with the nature of the hypercoordinate bonding to main group elements, as in molecules such as PF5 and SF6.

The spin-coupled description may be refined further, without altering the key features, by means of the incorporation of additional configurations in a nonorthogonal configuration interaction calculation, which we term the spin-coupled valence bond or SCVB method. This leads to very high accuracy for
ground and excited states, while retaining a compactness that aids unambiguous interpretation.
Figure 1. Rumer diagrams (R1 to R5) for N=6 and S=0.
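The count of five spin-coupling modes for N=6 and S=0, and the pairings drawn in Figure 1, can be reproduced with a few lines of code. The Python sketch below is illustrative only. The function names are hypothetical, the dimension is taken from the standard formula f(N,S) = C(N, N/2-S) - C(N, N/2-S-1), and the Rumer diagrams are generated as non-crossing pairings of electrons placed around a ring. The order in which the pairings are produced is not claimed to match the labels R1 to R5.

```python
from math import comb


def spin_space_dimension(n_electrons, total_spin):
    """Number of linearly independent spin functions for N electrons and spin S."""
    two_s = round(2 * total_spin)
    if two_s < 0 or two_s > n_electrons or (n_electrons - two_s) % 2:
        return 0
    n_beta = (n_electrons - two_s) // 2  # equals N/2 - S
    lower = comb(n_electrons, n_beta - 1) if n_beta >= 1 else 0
    return comb(n_electrons, n_beta) - lower


def rumer_diagrams(points):
    """All non-crossing singlet pairings of the labels in `points` (S = 0 only)."""
    if not points:
        return [[]]
    first, rest = points[0], points[1:]
    diagrams = []
    # The partner of `first` must leave an even number of labels on each side,
    # otherwise some other arrow would have to cross the arrow first -> partner.
    for k in range(0, len(rest), 2):
        partner = rest[k]
        for inside in rumer_diagrams(rest[:k]):
            for outside in rumer_diagrams(rest[k + 1:]):
                diagrams.append([(first, partner)] + inside + outside)
    return diagrams


if __name__ == "__main__":
    print(spin_space_dimension(6, 0))  # 5
    for diagram in rumer_diagrams([1, 2, 3, 4, 5, 6]):
        print(diagram)
```

Two of the five pairings couple only neighbouring electrons (the Kekulé-like patterns) and three contain one pair across the ring (the para-bonded patterns). For eight electrons with S=0 the same function gives 14 modes.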
3. BENZENE

We have recently carried out new calculations for benzene [8,10], at the geometry shown in Figure 2. All the calculations were performed using MOLPRO [35] or our own codes, as appropriate. The s and p basis functions for C and the s basis functions for H were taken from correlation-consistent pVTZ basis sets [36], and these were augmented with polarization functions with exponents $d_C$=0.8 and $p_H$=1.0, so that the C/H basis consists of (10s5p1d/5s1p) Cartesian Gaussians contracted to [4s3p1d/3s1p]. The MO energy level diagram is shown in Figure 3, with arrows ↑↓ to denote the occupancy in the restricted Hartree-Fock (RHF) configuration, $a_{2u}^2 e_{1g}^4$.
Figure 2. Geometry and atom labelling for benzene ($D_{6h}$): C-C 139.64 pm, C-H 108.31 pm.

Figure 3. MO energy level diagram for benzene, annotated in various ways. Occupation numbers shown alongside the levels: $b_{2g}$ 0.04; $e_{2u}$ 0.10, 0.10; $e_{1g}$ 1.90, 1.90; $a_{2u}$ 1.96.
A '6 in 6' CASSCF calculation [37] was performed for the π electrons, keeping the thirty-six σ electrons in an (optimized) closed-shell core. Unlike the spin-coupled model, this multiconfigurational description involves all possible distributions of the six π electrons in six π orbitals. The nondynamical correlation energy retrieved by this wavefunction amounts to approximately 73 millihartree (see Table 1). The numbers shown alongside each level in Figure 3 are the CASSCF natural orbital occupation numbers, some of which show significant deviations from the RHF values of 2 and 0. The weight of the configuration $a_{2u}^2 e_{1g}^4$ in the full CASSCF wavefunction is 88.4%. By definition, the RHF calculation retrieves none of the correlation energy incorporated in the CASSCF calculation.
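As a quick consistency check on the natural orbital occupation numbers annotated in Figure 3, the short Python sketch below sums them and measures how far they depart from the RHF values of 2 and 0. The assignment of each number to a level is an assumption read from the figure annotations (1.96 for $a_{2u}$, 1.90 for each $e_{1g}$ orbital, 0.10 for each $e_{2u}$ orbital, 0.04 for $b_{2g}$), and the names used are hypothetical.

```python
# Occupation numbers as read from the annotations of Figure 3 (assumed assignment).
CASSCF_OCCUPATIONS = {
    "a2u": [1.96],
    "e1g": [1.90, 1.90],
    "e2u": [0.10, 0.10],
    "b2g": [0.04],
}

# Levels that are empty in the RHF configuration a2u^2 e1g^4.
RHF_VIRTUAL_LEVELS = ("e2u", "b2g")


def total_electrons(occupations):
    """Sum of all natural orbital occupation numbers."""
    return sum(sum(values) for values in occupations.values())


def electrons_promoted(occupations, virtual_levels=RHF_VIRTUAL_LEVELS):
    """Occupation found in orbitals that are empty at the RHF level."""
    return sum(sum(occupations[level]) for level in virtual_levels)


if __name__ == "__main__":
    print(round(total_electrons(CASSCF_OCCUPATIONS), 2))     # 6.0
    print(round(electrons_promoted(CASSCF_OCCUPATIONS), 2))  # 0.24
```

With this reading the occupations account for all six π electrons, and roughly a quarter of an electron resides in orbitals that are unoccupied in the RHF configuration.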
Table 1
Energies calculated for benzene.

Calculation   E/hartree       (E-E_CAS)/millihartree   Proportion of (E_CAS-E_RHF)
CASSCF        -230.8368216     0                       100%
SC            -230.8293315     7.49                    89.7%
SC (f.c.)     -230.8293314     7.49                    89.7%
RHF           -230.7640556    72.77                    0%
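The derived columns of Table 1 follow directly from the total energies. A minimal Python sketch of that arithmetic is given below; the dictionary and function names are hypothetical, and the only inputs are the four energies listed in the table.

```python
# Total energies from Table 1, in hartree.
ENERGIES = {
    "CASSCF": -230.8368216,
    "SC": -230.8293315,
    "SC (f.c.)": -230.8293314,
    "RHF": -230.7640556,
}


def correlation_summary(energies):
    """Map each label to (E - E_CAS in millihartree, % of E_RHF - E_CAS recovered)."""
    e_cas = energies["CASSCF"]
    e_rhf = energies["RHF"]
    total = e_rhf - e_cas  # nondynamical correlation energy in the CASSCF
    summary = {}
    for label, energy in energies.items():
        gap_mh = 1000.0 * (energy - e_cas)
        recovered = 100.0 * (e_rhf - energy) / total
        summary[label] = (round(gap_mh, 2), round(recovered, 1))
    return summary


if __name__ == "__main__":
    for label, (gap, percent) in correlation_summary(ENERGIES).items():
        print(f"{label:10s} {gap:7.2f} millihartree  {percent:5.1f} %")
```

The same energies show that the SC and SC (f.c.) results differ by about 0.1 microhartree, far below the resolution of the rounded columns.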
In common with all other wavefunctions of 'full CI' form, CASSCF wavefunctions are invariant to general nonsingular linear transformation of the active orbitals, including nonunitary transformations that result in nonorthogonal orbitals [8,10,38,39]. As an alternative to, say, the natural orbital representation, we may exploit this invariance to generate a representation of the CASSCF wavefunction in which the dominant component takes spin-coupled form. We must stress that the wavefunction and the total energy are not changed in any way by this procedure. It proves straightforward to find an alternative representation of the full CASSCF wavefunction in which a spin-coupled-like component has an overlap with the total wavefunction that exceeds 0.995. The form of the orbitals and the mode of spin coupling [8,10] are very similar indeed to those that we describe later, based on fully-variational spin-coupled calculations. Indeed, this dominant spin-coupled-like component has an energy expectation value which lies within 0.1 millihartree [8] of the fully-variational spin-coupled result.

Spin-coupled calculations were carried out with an active space corresponding to the six π electrons and a frozen core taken directly from the CASSCF calculation. The resulting total energy, labelled SC (f.c.) in Table 1, retrieves 89.7% of the correlation energy incorporated in the CASSCF calculation. A fully-variational calculation (labelled SC in Table 1), in which we optimize also all of the doubly-occupied σ core orbitals, gives a further energy improvement on the order of only 0.1 microhartree. We find six symmetry-equivalent C($2p_\pi$) orbitals, each associated with a given carbon atom, but exhibiting some deformation
towards the neighbouring C atoms on each side, as is shown in Figure 4. These distortions are no larger than those that we have seen for the π bonds in alkenes. Numbering the orbitals according to the C atoms with which they are associated, the symmetry-unique overlap integrals are $\langle\phi_1|\phi_2\rangle$=0.524, $\langle\phi_1|\phi_3\rangle$=0.029 and $\langle\phi_1|\phi_4\rangle$=-0.157.
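Because the six orbitals are symmetry-equivalent, the three symmetry-unique overlaps (0.524 between neighbours, 0.029 between next-nearest neighbours and -0.157 across the ring) fix the whole 6×6 overlap matrix. The Python sketch below, with hypothetical function names, builds that matrix on the assumption that the overlap depends only on the separation around the ring, and evaluates its eigenvalues with the closed-form expression for a symmetric circulant matrix. All eigenvalues being positive is the condition for the six nonorthogonal orbitals to be linearly independent.

```python
from math import cos, pi

# Overlap as a function of ring separation (0 is the normalization of each orbital).
UNIQUE_OVERLAPS = {0: 1.0, 1: 0.524, 2: 0.029, 3: -0.157}


def overlap(i, j, n=6):
    """Overlap between orbitals i and j, assuming it depends only on ring separation."""
    separation = abs(i - j) % n
    return UNIQUE_OVERLAPS[min(separation, n - separation)]


def overlap_matrix(n=6):
    """Full n x n overlap matrix of the spin-coupled orbitals."""
    return [[overlap(i, j, n) for j in range(n)] for i in range(n)]


def circulant_eigenvalues(n=6):
    """Eigenvalues of the symmetric circulant overlap matrix."""
    first_row = overlap_matrix(n)[0]
    return [
        sum(first_row[m] * cos(2.0 * pi * k * m / n) for m in range(n))
        for k in range(n)
    ]


if __name__ == "__main__":
    for value in circulant_eigenvalues():
        print(round(value, 3))
```

The smallest eigenvalue is 0.167 and the largest is 1.949, so the overlap matrix is positive definite despite the large nearest-neighbour overlap.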
Figure 4. Various representations of spin-coupled orbital $\phi_1$ for benzene. Left: contours in the horizontal plane 1 bohr above the molecular plane. Centre: contours in a vertical mirror plane. Right: a representative isosurface (3-D contour).

In the Rumer basis [29] (see Figure 1), the total spin function corresponds to weights of 40.6% each for the two Kekulé structures (R1, R4) and of 6.3% each for the three para-bonded ('Dewar') structures (R2, R3, R5). These values are very close to those given many years ago by Pauling [40] in his original, and much simplified, classical VB calculation, and discussed by Coulson [41] in his textbook. The difference in energy between one of the Kekulé-type structures (R1) and the full spin-coupled wavefunction (with the full spin space) can reasonably be termed the resonance energy. Using this procedure, we obtain a value of 83.5 kJ mol$^{-1}$.

The remaining ~10% of the nondynamical correlation energy not recovered by the spin-coupled wavefunction is due primarily to the omission of doubly-ionic spin-coupled configurations. The contributions from singly-ionic configurations are smaller, as is to be expected for wavefunctions based on fully-optimized nonorthogonal orbitals. Of the various singly-ionic spin-coupled configurations, those with charges in meta positions are the most important [10].

The spin-coupled-like component of the CASSCF wavefunction, the SC (f.c.) wavefunction and the fully-variational spin-coupled wavefunction, which are all exceedingly similar to one another, vindicate the familiar description of benzene in terms of two Kekulé and three para-bonded structures. A crucial difference,
however, is the small degree of deformation of the orbitals towards neighbouring atoms. One consequence of these deformations is that the addition of 'ionic configurations', in which one or more of the valence orbitals is doubly-occupied, leads only to a very modest further improvement in the wavefunction. On the other hand, if one insists on strictly localized orbitals, as in classical valence bond theory, then the weights of the covalent structures decrease dramatically, at the expense of significant contributions from the plethora of possible ionic structures [42,43]. Similarly, in an exact representation of a '6 in 6' CASSCF wavefunction in terms of orthogonal localized molecular orbitals, Hirao et al. [44] found weights for the Kekulé structures of 7.8% each and for the para-bonded structures of 2.6% each, so that 76.6% of the benzene ground state wavefunction is apparently ionic! All of these observations reinforce our preference for the spin-coupled description of benzene.

One of the great strengths of an RHF mode of description is of course the ease with which one may obtain first estimates of the relative energies of excited states. In the particular case of benzene, a variety of low-lying states arise from configurations such as $a_{2u}^2 e_{1g}^3 e_{2u}$ and $a_{2u}^2 e_{1g}^3 b_{2g}$. However, it is difficult to see a priori why some of these should be covalent valence states, whereas others are predominantly ionic in character and some low-lying states are Rydberg states. The spin-coupled valence bond (SCVB) method has been used to study all the singlet and triplet valence states and n=3,4 Rydberg states of benzene below the first ionization limit [5]. The numerical accuracy provided by these very compact wavefunctions compares very favourably indeed with the most extensive correlated MO-CI calculations in the literature. We find that 'π-only' correlation affords an excellent description of the covalent valence states.
The same is true of the Rydberg states, provided that the σ core is derived from a calculation on the cation. On the other hand, a proper description of the ionic states requires some account to be taken of σ-π correlation effects.

The accuracy to which covalent states of benzene can be described with π-only correlation is a further justification for invoking σ-π separation for the various aromatic, antiaromatic and nonaromatic ring systems that we consider here. We have not addressed the question of whether it is primarily the σ electrons or the π electrons of benzene that drive the preference for a high-symmetry structure [45]. Similarly, we do not investigate here any 'bent bond' solutions, based on mixing σ and π orbitals [46].

The spin-coupled descriptions of naphthalene and azulene resemble those for benzene except, of course, that orbitals associated with bridging carbon atoms now show a three-way distortion [18,19]. Analogous descriptions arise also for heterocycles with five- and six-membered rings [3,13-16]; when there are two spin-coupled π orbitals for a given heteroatom, one of them adopts a tightly localized form whereas the other may exhibit significant delocalization onto the neighbouring atoms in the ring. The weights of the different modes of spin coupling, and the computed resonance energies, are consistent with the traditional organic chemistry views of these systems. Simple modern VB estimates of the ionization potentials [14,21] are at least as good for the lowest
states of the ion as those derived from Koopmans' theorem, while the higher ones appear to be considerably more reliable. In contrast to the heterocyclic systems, we have found that various inorganic systems, such as borazine, boroxine, N2S2, and perfluorocyclophosphazenes, are much closer to being zwitterionic species than they are inorganic analogues of benzene [4,21,22].

Spin-coupled studies of Wheland intermediates formed by ring protonation of benzene, phenol and benzonitrile provide ab initio support [7] for the usual qualitative VB arguments used to discuss the energetics and selectivity of aromatic electrophilic substitution reactions. Analogous calculations have also been performed for the reaction between benzene and a methyl cation [9].

4. CYCLOBUTADIENE
A classic example of an antiaromatic system is of course square-planar cyclobutadiene ($D_{4h}$), for which the π-electron MO energy level diagram is shown in Figure 5. The RHF configuration $a_{2u}^2 e_g^2$ implies states of $^1A_{1g}$, $^3A_{2g}$, $^1B_{1g}$ and $^1B_{2g}$ symmetries. According to Hund's rules, the triplet state ($^3A_{2g}$) ought to lie lowest. For this state, the square-planar geometry is stable with respect to geometric distortions, but it turns out that this is not the ground state. At this geometry, the state of $^1B_{2g}$ symmetry lies lower by more than 40 kJ mol$^{-1}$. Furthermore, this singlet ground state is subject to a second-order Jahn-Teller distortion, such that the equilibrium geometry for cyclobutadiene is based on a rectangular ring with two shorter C=C double bonds and two longer C-C single bonds. The reasons for the breakdown of Hund's rule for some systems with four-fold symmetry, such as square-planar cyclobutadiene, have been explored by various authors [47,48].
Figure 5. MO energy level diagram for square-planar ($D_{4h}$) cyclobutadiene, showing the $a_{2u}$, $e_g$ and $b_{2u}$ levels.

Straightforward applications of the spin-coupled approach confirm the relative energies and the preferred geometries of cyclobutadiene in these singlet and triplet states [11], without the requirement for multiconfigurational descriptions,
as in the analogous MO treatments. Of greater significance for the present account is the physical picture of the electronic structure and bonding revealed by these calculations.

The aromaticity of benzene is linked, in spin-coupled theory, to the particular mode of coupling of the electron spins, and so it seems reasonable to suppose that the orbital descriptions of $D_{4h}$ cyclobutadiene and of $D_{6h}$ benzene could be fairly similar, but for these to be associated with very different modes of spin coupling. To a first approximation, this indeed turns out to be the case. With benzene-like orbitals ordered a, b, c, d around the ring, the symmetry requirements of an overall $^1B_{1g}$ state are such that the electron spins associated with each diagonal (a/c and b/d) must be strictly triplet coupled. These two triplet subsystems combine to a net singlet. A characteristic feature of antiaromatic situations in spin-coupled theory is the presence of such triplet-coupled pairs of electrons.
Figure 6. Symmetry-unique spin-coupled orbitals for the $^1B_{1g}$ state of square-planar C4H4. Those for the $^3A_{2g}$ excited state are very similar indeed. Contours are shown in the horizontal plane 1 bohr above the molecular plane.

This is not quite the end of this particular story. In the very special case of electron spins which are strictly triplet coupled, the spin-coupled wavefunction is not altered by taking the sums and differences of the corresponding orbitals: (a+c), (a-c), (b+d) and (b-d), neglecting normalization. The term 'antipairs' has been coined for such an alternative mode of description. It turns out that this orbital representation already corresponds to the required $^1B_{1g}$ symmetry at the square-planar geometry, so that there are now more free parameters in the spin space: the relevant spins need no longer be exactly triplet coupled. We find that
the variational calculations for cyclobutadiene exploit this extra degree of freedom, so that the converged spin-coupled solutions for square-planar geometry adopt this antipair form [11], as is shown in Figure 6. The orbitals associated with a given diagonal remain close to a+c and a-c (neglecting normalization), and the relevant electron spins are almost exactly triplet coupled. The difference in energy between the antipair and localized orbital descriptions is exceedingly small and we may choose to regard them as almost equivalent. The key feature, in either mode of description, is the presence of triplet-coupled pairs of electrons. As the molecule distorts towards its equilibrium geometry, the spin-coupled orbitals rapidly adopt the characteristic shapes and associated pattern of spin coupling expected for two separate C=C double bonds.

Starting from square-planar cyclobutadiene, the formal replacement with CH2 groups of H atoms on opposing diagonals leads to DMCB (see Figure 7), with six π electrons. We find that the spin-coupled description of the $^3B_{2u}$ ground state possesses one antipair [11], across the diagonal labelled a/b. The dominance of the relevant mode of spin coupling can be seen most easily by expressing the total spin function in the Serber basis. However, there is also significant triplet character in the two exocyclic C=C bonds, consistent with experimental EPR measurements. The lowest singlet state of DMCB is found to possess $^1A_g$ symmetry [11], rather than the $^1B_{2u}$ symmetry expected from Hückel theory, but the planar geometry is no more than a transition state for the formation of a nonplanar bicyclic system via the development of a long transannular bond. The coupling together of multiple DMCB triplet units [18], as in BBB (see Figure 8), which features antipairs across the diagonals a/b and a'/b' in separate rings, opens up interesting possibilities for antiferromagnetic polymers.
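The invariance behind the antipair description, for strictly triplet-coupled spins, can be checked numerically. When two electrons are triplet coupled, the spatial factor for orbitals a and c is the antisymmetric product a(1)c(2) - c(1)a(2); replacing the orbitals by a+c and a-c multiplies this product by the constant -2 and so changes only the normalization. The Python sketch below is a toy illustration under stated assumptions: the orbitals are arbitrary coefficient vectors in a hypothetical four-function basis, not the computed cyclobutadiene orbitals, and the function names are invented for the example.

```python
def antisymmetric_product(u, v):
    """Coefficients of u(1)v(2) - v(1)u(2) in the product basis."""
    n = len(u)
    return [[u[i] * v[j] - v[i] * u[j] for j in range(n)] for i in range(n)]


def antipair_orbitals(a, c):
    """Unnormalized sum and difference of two orbitals."""
    plus = [x + y for x, y in zip(a, c)]
    minus = [x - y for x, y in zip(a, c)]
    return plus, minus


def max_deviation(f, g, factor):
    """Largest element of |f - factor * g|."""
    n = len(f)
    return max(abs(f[i][j] - factor * g[i][j]) for i in range(n) for j in range(n))


# Arbitrary, nonorthogonal model orbitals on opposite corners of a ring.
a = [1.0, 0.3, 0.0, 0.2]
c = [0.0, 0.2, 1.0, 0.3]

localized = antisymmetric_product(a, c)
antipair = antisymmetric_product(*antipair_orbitals(a, c))

if __name__ == "__main__":
    # The two descriptions differ only by the constant factor -2.
    print(max_deviation(antipair, localized, -2.0))
```

The printed deviation is at the level of rounding error. The same substitution applied to a singlet-coupled (symmetric) product would not reproduce the original function, which is why the equivalence is special to triplet-coupled pairs.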
Figure 7. DMCB or 2,4-dimethylenecyclobutane-1,3-diyl, with the ring diagonal labelled a/b.

Figure 8. BBB or bismethylenebiscyclobutylidene, with the ring diagonals labelled a/b and a'/b'.
We have also studied the bicyclic planar structure, with eight π electrons, that arises formally from the fusion of benzene and cyclobutadiene rings. Our interest here was in determining the extent to which the particular electronic structure features of the two separate rings might persist. We found [20] that the aromaticity of the distorted benzene ring prevails in the singlet ground state of benzocyclobutadiene, with an essentially isolated double bond in the smaller ring.
5. CYCLOOCTATETRAENE

Spin-coupled calculations at the idealized $D_{8h}$ geometry of cyclooctatetraene reveal a description dominated by triplet coupling of pairs of electrons [12], as anticipated earlier. Expressing the total spin function in the Serber basis [29], we find that the mode made up only of triplet-coupled pairs is responsible for 75% of the total. We find that the π orbitals for this antiaromatic system (see Figure 9) adopt localized forms that resemble closely those shown in Figure 4 for benzene, rather than the antipair representation shown for cyclobutadiene in Figure 6. On distorting the regular octagon to another idealized geometry, with alternating shorter and longer sides ($D_{4h}$ symmetry), the triplet-coupled pairs disappear, and we observe instead orbitals and a mode of spin coupling that are characteristic of an alkene (nonaromatic system). The equilibrium geometry for cyclooctatetraene is in fact nonplanar, namely a tub structure ($D_{2d}$). The spin-coupled description of the π-like electrons in this nonaromatic system (see Figure 9) corresponds to four essentially localized olefinic bonds [12].
Figure 9. Symmetry-unique spin-coupled orbitals for cyclooctatetraene. Left: idealized $D_{8h}$ geometry (contours in the horizontal plane 1 bohr above the molecular plane). Right: analogous representation for the nonplanar equilibrium geometry ($D_{2d}$).
6. CONCLUSIONS

Aromatic molecules play a central role in organic chemistry and, although a somewhat fuzzy concept that eludes definition in terms of clear-cut experimental
and/or theoretical criteria [49], aromaticity certainly continues to be of great utility at a qualitative level. We have concentrated in the present account on the archetypal aromatic system, benzene, and considered also two classic examples of antiaromatic/nonaromatic systems, namely cyclobutadiene and cyclooctatetraene.

Analysis of a CASSCF description of benzene that takes account of (nondynamical) electron correlation for the π-electron system proves to be especially informative. Simple MO theory, in the form of the RHF configuration $a_{2u}^2 e_{1g}^4$, accounts for 88.4% of this wavefunction and (by definition) recovers none of the nondynamical correlation. A modern valence bond representation, in the form of a spin-coupled-like component, has an overlap with this wavefunction in excess of 0.995 and it recovers 89.7% of the nondynamical correlation energy. It is clear also from the corresponding fully-variational spin-coupled calculations that the spin-coupled wavefunction is numerically superior to the standard RHF description. On the other hand, the magnitudes of these various numbers suggest that we should not discard the MO description. Instead, we may confidently make predictions using either model, depending on which one appears to be more convenient. We should, of course, be very wary of any situation for which the two models appear to lead to conflicting conclusions.

The ab initio spin-coupled description of the correlated π-electron system in benzene corresponds directly to resonance between Kekulé and para-bonded structures, built from a single product of nonorthogonal orbitals. These spin-coupled orbitals resemble closely those in the covalent-only classical VB model except for small, but crucial, deformations of the C($2p_\pi$) functions.
This vindicates the continued use of such descriptions in organic chemistry, but we should bear in mind that it is the small degree of delocalization onto neighbouring centres that precludes the necessity of incorporating further (ionic) configurations. Analogous descriptions arise for a wide range of other aromatic species. It is now well established that the simple Htickel 4n+2 rule exaggerates the differences between 'aromatic' and 'antiaromatic' ~-electron systems. At the relevant idealized geometries, the characteristic feature of the spin-coupled description of antiaromatic molecules is the occurrence of essentially tripletcoupled pairs of electrons, whether in antipair or localized orbital form. It is this simultaneous unfavourable coupling of the electron spins, which is suggestive of diradical character, that discourages bonding interactions. Of course, the actual singlet ground state geometries of cyclobutadiene and cyclooctatetraene correspond to nonaromatic situations, with olefinic bonds. However, we have found that antipairs do persist at the equilibrium geometries of a number of systems. For a wide range of aromatic, antiaromatic and nonaromatic systems [1-26], the spin-coupled model provides highly visual, but accurate, descriptions of the motion of correlated ~ electrons in terms of nonorthogonal orbitals and the dominance of particular patterns of spin coupling. A striking feature is the simplicity and consistency of the descriptions that emerge.
REFERENCES

1. D.L. Cooper, J. Gerratt, and M. Raimondi, Nature 323 (1986) 699.
2. J. Gerratt, Chem. in Brit. 23 (1987) 327.
3. D.L. Cooper, S.C. Wright, J. Gerratt, and M. Raimondi, J. Chem. Soc. Perkin Trans. 2 (1989) 255.
4. D.L. Cooper, S.C. Wright, J. Gerratt, P.A. Hyams, and M. Raimondi, J. Chem. Soc. Perkin Trans. 2 (1989) 719.
5. E.C. da Silva, J. Gerratt, D.L. Cooper, and M. Raimondi, J. Chem. Phys. 101 (1994) 3866.
6. G. Raos, J. Gerratt, D.L. Cooper, and M. Raimondi, Chem. Phys. 186 (1994) 233.
7. G. Raos, J. Gerratt, P.B. Karadakov, D.L. Cooper, and M. Raimondi, J. Chem. Soc. Faraday Trans. 91 (1995) 4011.
8. T. Thorsteinsson, D.L. Cooper, J. Gerratt, and M. Raimondi, Theor. Chim. Acta 95 (1997) 131.
9. G. Raos, L. Astorri, M. Raimondi, D.L. Cooper, J. Gerratt, and P.B. Karadakov, J. Phys. Chem. A, in press.
10. D.L. Cooper, T. Thorsteinsson, J. Gerratt, and M. Raimondi, Int. J. Quant. Chem., in press (Proceedings of the 37th Sanibel Symposium).
11. (a) S.C. Wright, D.L. Cooper, J. Gerratt, and M. Raimondi, J. Chem. Soc. Chem. Comm. (1989) 1489. (b) ibid, J. Phys. Chem. 96 (1992) 7943.
12. P.B. Karadakov, J. Gerratt, D.L. Cooper, and M. Raimondi, J. Phys. Chem. 99 (1995) 10186.
13. P.B. Karadakov, M. Ellis, J. Gerratt, D.L. Cooper, and M. Raimondi, Int. J. Quant. Chem., in press.
14. D.L. Cooper, S.C. Wright, J. Gerratt, and M. Raimondi, J. Chem. Soc. Perkin Trans. 2 (1989) 263.
15. P.C.H. Mitchell, G.M. Raos, P.B. Karadakov, J. Gerratt, and D.L. Cooper, J. Chem. Soc. Faraday Trans. 91 (1995) 749.
16. M. Sironi, A. Forni, M. Raimondi, D.L. Cooper, and J. Gerratt, to be published.
17. P.B. Karadakov, J. Gerratt, G. Raos, D.L. Cooper, and M. Raimondi, Isr. J. Chem. 33 (1993) 253.
18. G. Raos, J. Gerratt, D.L. Cooper, and M. Raimondi, Chem. Phys. 186 (1994) 251.
19. M. Sironi, D.L. Cooper, M. Raimondi, and J. Gerratt, J. Chem. Soc. Chem. Comm. (1989) 675.
20. P.B. Karadakov, J. Gerratt, D.L. Cooper, M. Raimondi, and M. Sironi, Int. J. Quant. Chem. 60 (1996) 545.
21. M. Raimondi, M. Sironi, J. Gerratt, and D.L. Cooper, to be published.
22. J. Gerratt, S.J. McNicholas, P.B. Karadakov, M. Sironi, M. Raimondi, and D.L. Cooper, J. Am. Chem. Soc. 118 (1996) 6742.
23. G. Raos, J. Gerratt, D.L. Cooper, and M. Raimondi, Mol. Phys. 79 (1993) 197.
24. M. Sironi, M. Raimondi, D.L. Cooper, and J. Gerratt, J. Mol. Struct. (THEOCHEM) 338 (1995) 257.
25. D.L. Cooper, J. Gerratt, and M. Raimondi, Top. in Curr. Chem. 153 (1990) 41.
26. D.L. Cooper, J. Gerratt, and M. Raimondi, Chem. Rev. 91 (1991) 929.
27. W. Heitler and F. London, Z. Phys. 44 (1927) 455.
28. C.A. Coulson and I. Fischer, Phil. Mag. 40 (1949) 386.
29. R. Pauncz, Spin eigenfunctions: construction and use, Plenum, New York, 1979.
30. P.B. Karadakov, J. Gerratt, D.L. Cooper, and M. Raimondi, Theor. Chim. Acta 90 (1995) 51.
31. D.L. Cooper, J. Gerratt, and M. Raimondi, Adv. in Chem. Phys. 69 (1987) 319.
32. D.L. Cooper, J. Gerratt, and M. Raimondi, Int. Rev. Phys. Chem. 7 (1988) 59.
33. J. Gerratt, D.L. Cooper, and M. Raimondi, in Valence bond theory and chemical structure, ed. D.J. Klein and N. Trinajstić, Elsevier, Amsterdam, 1990; pages 287-349.
34. J. Gerratt, D.L. Cooper, P.B. Karadakov, and M. Raimondi, Chem. Soc. Rev., in press.
35. MOLPRO is a package of ab initio programs written by H.-J. Werner and P.J. Knowles, with contributions from J. Almlöf, R.D. Amos, A. Berning, M.J.O. Deegan, F. Eckert, S.T. Elbert, C. Hampel, R. Lindh, W. Meyer, A. Nicklaß, K. Peterson, R. Pitzer, A.J. Stone, P.R. Taylor, M.E. Mura, P. Pulay, M. Schütz, H. Stoll, T. Thorsteinsson, and D.L. Cooper.
36. T.H. Dunning Jr., J. Chem. Phys. 90 (1989) 1007; the pVTZ basis was taken directly from the MOLPRO library.
37. (a) H.-J. Werner and P.J. Knowles, J. Chem. Phys. 82 (1985) 5053. (b) P.J. Knowles and H.-J. Werner, Chem. Phys. Lett. 115 (1985) 259.
38. T. Thorsteinsson, D.L. Cooper, J. Gerratt, P.B. Karadakov, and M. Raimondi, Theor. Chim. Acta 93 (1996) 343.
39. T. Thorsteinsson and D.L. Cooper, Theor. Chim. Acta 94 (1996) 233.
40. L. Pauling, J. Chem. Phys. 1 (1933) 280.
41. C.A. Coulson, Valence, 2nd. edition, Clarendon, Oxford, 1961; chapter 9.
42. (a) J.M. Norbeck and G.A. Gallup, J. Am. Chem. Soc. 95 (1973) 4460. (b) ibid, J. Am. Chem. Soc. 96 (1974) 3386. (c) G.A. Gallup and J.M. Norbeck, J. Am. Chem. Soc. 97 (1975) 970.
43. G.F. Tantardini, M. Raimondi, and M. Simonetta, J. Am. Chem. Soc. 99 (1977) 2913.
44. K. Hirao, H. Nakano, K. Nakayama, and M. Dupuis, J. Chem. Phys. 105 (1996) 9227.
45. P.C. Hiberty, D. Danovich, A. Shurki, and S. Shaik, J. Am. Chem. Soc. 117 (1995) 7760.
46. P. Schultz and R.P. Messmer, Phys. Rev. Lett. 58 (1987) 2416.
47. H. Kollmar and V. Staemmler, Theor. Chim. Acta 48 (1978) 223.
48. G.A. Gallup, J. Chem. Phys. 86 (1987) 4018.
49. P.J. Garratt, Aromaticity, John Wiley, New York, 1986; chapter 11.
Z.B. Maksić and W.J. Orville-Thomas (Editors)
Pauling's Legacy: Modern Modelling of the Chemical Bond
Theoretical and Computational Chemistry, Vol. 6
© 1999 Elsevier Science B.V. All rights reserved.
Aromaticity and its Chemical Manifestations

Kenneth B. Wiberg
Department of Chemistry, Yale University, New Haven, CT 06520

1. HISTORICAL PRELUDE

In the 18th century, a number of naturally occurring compounds were isolated and described as "aromatic" because of their distinctive odor.¹ When the structural theory of organic chemistry was developed in the 19th century, it became apparent that most of these compounds were benzene derivatives. As a result, they became known as aromatic compounds, in contrast to aliphatic compounds. The benzene derivatives presented an enigma to structural chemists in that although the benzene rings had three double bonds, they underwent substitution rather than addition when treated with reagents such as bromine and nitric acid. No adequate explanation for their behavior was presented prior to the development of quantum mechanics.

In the early 1930's, two explanations were presented. One was by Pauling making use of valence bond theory,² and the other was by E. Hückel making use of molecular orbital theory.³ Pauling's use of valence bond theory had a direct connection with the types of structures commonly used by organic chemists, and was relatively easy to understand, provided one did not delve too deeply into its details. The basic postulate was that compounds having π-electron systems that can be described by more than one structure will be stabilized by "resonance" and will have a lower energy than any of the contributing structures. Thus, for benzene one would write
[Scheme: resonance structures of benzene, the two Kekule structures followed by the Dewar structures.]
where the more important structures are the first two because they have a lower energy than the "Dewar" structures. This type of formulation was easily extended to reactions.⁴ Thus, considering the electrophilic substitution of anisole, one might write:

[Scheme: charge-separated resonance structures of anisole.]
and the donation of π-electron density to the o and p positions would account for the preference for o,p substitution. Of course, it is now recognized that this type of stabilization would by itself lead to reduced reactivity,⁵ and it is the corresponding interactions in the transition state that lead to the observed reactivity and position of substitution.

The alternative was molecular orbital theory, which was first applied to benzene by Hückel. Here, the energy levels were found to be

[Diagram: Hückel π energy levels of benzene, from -2β (top) through -β and β (each doubly degenerate) to 2β (bottom); the six π electrons occupy the β and 2β levels.]

The total π-energy is 4×β + 2×(2β) or 8β, whereas that of three ordinary double bonds would be 3×2β or 6β. Thus, benzene is stabilized by 2β. This model was not cast in the terms usually used by organic chemists, and thus its adoption required considerable time. However, it led in an obvious way to the 4n+2 rule for aromaticity which has been a mainstay of organic chemists interested in the subject. It led to the prediction that the following ions would be stabilized in a fashion similar to that for benzene.
All of them are now known to be unusually stable species.⁶ On the other hand, the following would be expected to have little (if any) stabilization and high reactivity because of the half-filled non-bonded orbitals.

[Scheme: structures of the species expected to lack stabilization.]
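In the case of the square conformation of cyclobutadiene, the energy levels would be

[Diagram: Hückel π energy levels of square cyclobutadiene at -2β (top), 0 (doubly degenerate, non-bonding) and 2β (bottom); two electrons occupy the 2β level and two go into the degenerate level.]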
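The Hückel level patterns quoted for benzene and square cyclobutadiene can be reproduced with a few lines of arithmetic. The Python sketch below is illustrative only and is not taken from the chapter: it assumes the standard closed-form Hückel result for an N-membered ring, E_k = α + 2β cos(2πk/N), sets α to zero, and reports energies as multiples of β. The function names are hypothetical.

```python
import math


def huckel_ring_levels(n_atoms):
    """Huckel levels of an n-membered ring as multiples x of beta (E = alpha + x*beta).

    beta is negative, so the largest x is the most bonding level; it is listed first.
    """
    xs = [2.0 * math.cos(2.0 * math.pi * k / n_atoms) for k in range(n_atoms)]
    return sorted(xs, reverse=True)


def pi_energy(n_atoms, n_electrons):
    """Total pi energy in units of beta (alpha taken as zero), two electrons per level."""
    total, remaining = 0.0, n_electrons
    for x in huckel_ring_levels(n_atoms):
        occupancy = min(2, remaining)
        total += occupancy * x
        remaining -= occupancy
    return total


def delocalization_energy(n_atoms, n_electrons):
    """pi energy relative to n_electrons/2 isolated double bonds (2 beta each)."""
    return pi_energy(n_atoms, n_electrons) - 2.0 * (n_electrons // 2)


def is_closed_shell(n_atoms, n_electrons):
    """True when the highest occupied set of degenerate levels is completely filled."""
    levels = [round(x, 9) for x in huckel_ring_levels(n_atoms)]
    remaining, i = n_electrons, 0
    while i < len(levels):
        j = i
        while j < len(levels) and levels[j] == levels[i]:
            j += 1                      # j - i is the degeneracy of this level
        capacity = 2 * (j - i)
        if remaining <= capacity:
            return remaining == capacity
        remaining -= capacity
        i = j
    return False


if __name__ == "__main__":
    for name, n_atoms, n_electrons in [("benzene", 6, 6), ("cyclobutadiene", 4, 4)]:
        print(name,
              round(pi_energy(n_atoms, n_electrons), 6),
              round(delocalization_energy(n_atoms, n_electrons), 6),
              is_closed_shell(n_atoms, n_electrons))
```

Within this simple model the closed-shell test returns True for 4n+2 electron counts on a ring and False for 4n counts, which is the content of the 4n+2 rule discussed in the text.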
In square cyclobutadiene, two electrons go into the doubly degenerate non-bonding level, leading to an unfilled level and high reactivity. The Hückel π-energy would be 2×2β or 4β, the same as that for two unconjugated double bonds. Hence the prediction that it will not be stabilized. It is known that cyclobutadiene is highly reactive and that the preferred conformation is rectangular in order to break the degeneracy of the highest occupied molecular orbitals.⁷ Cyclopentadienyl cation has not been prepared despite many studies,⁸ and cyclooctatetraene adopts a "tub" conformation in order to minimize interaction between the π-orbitals.⁹

Valence bond theory, in the terms defined by Pauling, is not able to account for the 4n+2 rule, and the properties of cyclobutadiene and cyclooctatetraene. It has been suggested that the problem with these molecules is the strain associated with the bond angles in the planar structures.¹⁰ However, this was shown to be incorrect by the observation that the addition of two electrons to cyclooctatetraene leads to the planar dianion. It is only recently that it has been recognized that cyclic permutations must be included in order to properly treat cyclic systems via valence bond theory.¹¹ One of Pauling's few failures in structural theory is his nonrecognition of the problems associated with the 4n molecules.

2. VALENCE BOND VS MOLECULAR ORBITAL THEORY:

In 1927, Heitler and London carried out a calculation for the hydrogen molecule using what has become known as valence bond theory.¹² Each electron of the pair could be assigned to nuclei corresponding to wavefunctions of the type

$\psi_1 = \phi_a(1)\phi_b(2)$ and $\psi_2 = \phi_a(2)\phi_b(1)$

In the first, electron 1 is in the 1s atomic orbital $\phi_a$ and electron 2 is in $\phi_b$. In the second, the electrons are reversed. The two wave functions, when considered separately, lead to the same energy.
If we wish to form the hydrogen molecule, it is necessary to take linear combinations of the above, and the two new wavefunctions become (neglecting the normalization constant):

$\psi_+ = \phi_a(1)\phi_b(2) + \phi_a(2)\phi_b(1)$
$\psi_- = \phi_a(1)\phi_b(2) - \phi_a(2)\phi_b(1)$

where the first is for the bonding state and the second is for the antibonding state. The binding energy of a hydrogen molecule calculated using the bonding wave function is 86.7 kcal/mol with a bond length of 0.743 Å (the experimental values are 109.5 kcal/mol and a bond length of 0.740 Å).¹³
In molecular orbital theory, molecular orbitals are formed by linear combinations of atomic orbitals, and the bonding molecular orbital is $\phi_a + \phi_b$.¹⁴ The molecular wave function is the product of the wave functions for the two electrons or $[\phi_a(1) + \phi_b(1)][\phi_a(2) + \phi_b(2)]$. When the energy is calculated using this wavefunction, it is found to be 80.0 kcal/mol with a length of 0.732 Å. In this case, the valence bond result is slightly more satisfactory than the molecular orbital result, but neither is really satisfactory. If we multiply out the MO wavefunction, we obtain

$\psi = \phi_a(1)\phi_b(2) + \phi_a(2)\phi_b(1) + \phi_a(1)\phi_a(2) + \phi_b(1)\phi_b(2)$

It can be seen that the first two terms are the same as the valence bond wavefunction, and there are an additional two terms. The first two are commonly called the covalent terms because each has the electrons associated with both centers. The final two are known as ionic terms because each places two electrons at one center (i.e. H⁺ H⁻ and H⁻ H⁺). Valence bond theory generally neglects ionic terms of this type, whereas in MO theory the covalent and ionic terms are treated equally.

A better result can be obtained if the proportion of covalent and ionic terms can be adjusted. This can be done by mixing the doubly excited state with the ground state MO wavefunction. The new wavefunction is then

$\psi = [\phi_a(1) + \phi_b(1)][\phi_a(2) + \phi_b(2)] + a[\phi_a(1) - \phi_b(1)][\phi_a(2) - \phi_b(2)]$
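The balance between covalent and ionic terms in this two-configuration wavefunction can be checked by brute-force expansion. The Python sketch below is illustrative only: the dictionary representation and the function names are assumptions made here, normalization is ignored as in the text, and orbital products are keyed by (orbital holding electron 1, orbital holding electron 2).

```python
def expand_product(orbital_1, orbital_2):
    """Expand [sum_i c_i phi_i(1)] * [sum_j d_j phi_j(2)] into labelled terms.

    Each orbital is a dict mapping an atomic-orbital label ("a" or "b") to its
    coefficient; the result maps (label for electron 1, label for electron 2)
    to the coefficient of that product.
    """
    terms = {}
    for label_1, c_1 in orbital_1.items():
        for label_2, c_2 in orbital_2.items():
            key = (label_1, label_2)
            terms[key] = terms.get(key, 0.0) + c_1 * c_2
    return terms


def mixed_wavefunction(a):
    """Ground-state MO configuration plus a times the doubly excited configuration."""
    bonding = {"a": 1.0, "b": 1.0}        # phi_a + phi_b
    antibonding = {"a": 1.0, "b": -1.0}   # phi_a - phi_b
    ground = expand_product(bonding, bonding)
    excited = expand_product(antibonding, antibonding)
    return {key: ground[key] + a * excited[key] for key in ground}


def covalent_and_ionic(a):
    """Return (covalent coefficient, ionic coefficient) for mixing parameter a."""
    psi = mixed_wavefunction(a)
    covalent = psi[("a", "b")]   # equal to psi[("b", "a")] by symmetry
    ionic = psi[("a", "a")]      # equal to psi[("b", "b")] by symmetry
    return covalent, ionic


if __name__ == "__main__":
    for a in (0.0, -0.59, -1.0):
        covalent, ionic = covalent_and_ionic(a)
        print(f"a = {a:+.2f}: covalent = {covalent:.2f}, ionic = {ionic:.2f}")
```

With a = 0 the covalent and ionic coefficients are equal (the MO wavefunction), and with a = -1 the ionic coefficient vanishes (the Heitler-London wavefunction); intermediate values interpolate between the two.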
The effect of introducing the second term can best be seen by multiplying out the wavefunction, giving

$\psi = (1+a)[\phi_a(1)\phi_a(2) + \phi_b(1)\phi_b(2)] + (1-a)[\phi_a(1)\phi_b(2) + \phi_a(2)\phi_b(1)]$

It can be seen that if a = 0, this becomes the MO wavefunction and that if it is -1, it is the Heitler-London wavefunction. With a = -0.59, a binding energy of 92 kcal/mol with a bond length of 0.748 Å is obtained.¹⁵ This is a significant improvement. The calculated binding energy may be further improved by using larger basis sets and extensive configuration interaction, and can be made to closely agree with the experimental values.¹⁶

The purpose of the above digression is to indicate that neither VB nor MO theory will give close to an exact answer unless configuration interaction is included in the calculation. It is reasonable to believe that both VB and MO will give exactly the same result if both large basis sets and extensive configuration interaction are employed. The reality is that we know how to push MO theory to near the limit, but we do not know how to do VB calculations in a direct way. One normally
carries out an MO calculation at the HF level (including no configuration interaction), and then converts the MO wavefunction into VB wavefunctions.¹⁷ Limited correction for electron correlation is generally effected by using Goddard's Generalized Valence Bond model (GVB)¹⁸ or some variant of it. Additional configuration interaction would lead to mixing of the localized valence bond units, and would destroy the valence bond type of representation.

It should be recognized that both the molecular orbitals and the valence bond counterparts are just mathematical constructs that facilitate the calculation of the properties of a molecule. They have no physical meaning. The physically meaningful quantity from any of these calculations is the charge density distribution that may be derived from the total wavefunction. From this distribution one can obtain all of the measurable ground state properties of a molecule, including the energy, dipole and higher electrical moments, etc.¹⁹

3. MANIFESTATIONS OF "AROMATIC" STABILIZATION

Several quantities have been used to obtain a measure of aromatic stabilization. They include:

a. substitution rather than addition
b. stabilization energy
c. bond lengths
d. NMR ring currents
e. spectroscopic properties

As noted above, the first definition of "aromaticity" was in terms of substitution rather than addition. This is certainly true for many benzene derivatives. However, it must be used with some care since thiophene is by most criteria about as "aromatic" as benzene, but when treated with chlorine or bromine it gives an addition product. The latter is, however, the kinetically controlled product, for when heated or treated with base it loses hydrogen halide and gives the 2-halothiophene.²⁰ Compounds such as anthracene and phenanthrene, which are recognized as having considerable resonance stabilization, also undergo addition reactions.
Thermochemical stabilization is probably the most generally applicable of the simple criteria for "aromaticity". Pauling made use of heat of combustion of benzene and a set of average bond energies to derive a resonance energy of 37 kcal/mol.²¹ The most useful measure of this quantity is derived from the heats of hydrogenation (kcal/mol) obtained by Kistiakowsky and his coworkers:²²

[Scheme: stepwise hydrogenation of benzene to 1,3-cyclohexadiene, cyclohexene and cyclohexane, with heats of hydrogenation of +6, -26 and -28 kcal/mol for the three steps.]
The first step in the reduction is endothermic, accounting for the difficulty in hydrogenating benzene. A minimum value of the stabilization is given by the difference between the first and last steps in the above sequence, or 34 kcal/mol. It is believed that 1,3-cyclohexadiene is stabilized by about 2 kcal/mol, leading to the commonly stated 36 kcal/mol resonance energy of benzene.

The resonance energies, or perhaps better, stabilization energies, are really not wholly satisfactory, for the bond lengths in cyclohexene and benzene are not the same. There is an additional term, the "compression energy", required to make the bond lengths the same. Simpson has provided a useful way in which to consider the formation of benzene from the Kekule structures.²³ The square of a wavefunction represents a structure:
$\psi_1^2$ = [first Kekule structure]    $\psi_2^2$ = [second Kekule structure]
and the energy of the Kekule structures will be given by

$E_1 = \int \psi_1 H^0 \psi_1 \, d\tau$
$E_2 = \int \psi_2 H^0 \psi_2 \, d\tau$

where $H^0$ is the appropriate Hamiltonian operator. The two energies are, of course, the same. In order to convert the Kekule structures to the geometry appropriate for benzene, the bond lengths must be changed. This may be considered as a perturbation. Thus

$H = H^0 + V$

where V is the perturbation operator. Here, this operator will lead to the following

$V_{11} = \int \psi_1 V \psi_1 \, d\tau = \int \psi_2 V \psi_2 \, d\tau$

and

$V_{12} = \int \psi_1 V \psi_2 \, d\tau$

Then since $E_1 = E_2$, we write a secular determinant of the form

$\begin{vmatrix} E_1 + V_{11} - E & V_{12} \\ V_{12} & E_1 + V_{11} - E \end{vmatrix} = 0$

and

$E = E_1 + V_{11} \pm V_{12}$

The relationship between these quantities is shown in Figure 1. The compression energy may be estimated to be about 15 kcal/mol from a calculation of the change in energy with a change in bond lengths for benzene.²⁴ The structures having the same bond lengths mix to form a ground state and an excited state. The excited state that has the correct symmetry is found at 260 nm, corresponding to 110 kcal/mol above the ground state. Thus, V12 = 55 kcal/mol. The estimated thermochemical resonance energy is then about 40 kcal/mol (55 less the 15 kcal/mol compression energy), in remarkable agreement with the observed value.²⁵
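The numbers in this two-state estimate are easy to check. The Python sketch below is a hedged illustration, not the author's procedure: it converts the 260 nm transition to kcal/mol using CODATA constants, takes V12 as half the splitting between the two roots of the secular determinant, and identifies V11 with the 15 kcal/mol compression energy (an assumption based on the reading of Figure 1). Function names are hypothetical.

```python
PLANCK = 6.62607015e-34       # J s
LIGHT_SPEED = 2.99792458e8    # m / s
AVOGADRO = 6.02214076e23      # 1 / mol
JOULE_PER_KCAL = 4184.0


def wavelength_to_kcal_per_mol(wavelength_nm):
    """Photon energy for the given wavelength, expressed per mole in kcal."""
    energy_joule = PLANCK * LIGHT_SPEED / (wavelength_nm * 1e-9)
    return energy_joule * AVOGADRO / JOULE_PER_KCAL


def two_state_levels(e1, v11, v12):
    """Roots of the 2x2 secular determinant with equal diagonal elements.

    Returns (lower, upper) = (E1 + V11 - |V12|, E1 + V11 + |V12|).
    """
    centre = e1 + v11
    return centre - abs(v12), centre + abs(v12)


def simpson_estimate(wavelength_nm, compression_energy):
    """Vertical and thermochemical stabilization energies in kcal/mol.

    The zero of energy is one Kekule structure at its own geometry (E1 = 0),
    and V11 is taken to be the compression energy (assumption).
    """
    gap = wavelength_to_kcal_per_mol(wavelength_nm)
    v12 = gap / 2.0                      # the two roots differ by 2*V12
    ground, excited = two_state_levels(0.0, compression_energy, v12)
    return {
        "gap": gap,
        "V12": v12,
        "vertical": v12,
        "thermochemical": -ground,
        "excited": excited,
    }


if __name__ == "__main__":
    result = simpson_estimate(260.0, 15.0)
    for key, value in result.items():
        print(f"{key:15s} {value:7.1f} kcal/mol")
```

Under these assumptions the 260 nm band gives a splitting close to 110 kcal/mol, and subtracting the compression energy from V12 gives a thermochemical stabilization close to 40 kcal/mol, as stated in the text.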
[Figure 1: energy-level diagram showing the ground state (-V12) and the excited state (+V12), with the compression energy, the vertical stabilization energy and the thermochemical stabilization energy marked.]
Figure 1. Relationship between vertical and thermochemical stabilization energies and the compression energy.

This treatment may give us confidence in the use of the thermochemical stabilization energies of benzenoid compounds. Good values of the heats of formation of many of these compounds in the gas phase are not available. Therefore it is useful to examine some calculated values. The total energies of some annelated benzenes at the B3LYP/6-311G** level are given in Table 1. The zero-point energies of isomers in this series are essentially constant²⁶ and therefore differences in energy between isomers may be directly compared with experimental values. The experimental difference in energy between anthracene and phenanthrene is well reproduced. The calculated energy difference between naphthacene and chrysene is larger than the experimental difference, but the latter has considerable uncertainty
because of the difficulty in measuring the heats of sublimation at room temperature for compounds with high melting points.

Table 1. Calculated energies, B3LYP/6-311G**

Compound       Energy(a)     ΔE(b)    ΔHf(c)      ΔΔHf(b)    Stab. E.   per CC bond
Benzene        -232.30855             19.7±0.2                36         12.0
Naphthalene    -385.98493             35.9±0.3                60         12.0
Anthracene     -539.65518             55.2±0.5                80         11.5
Naphthacene    -693.32294             69.9±2.2                99         11.0
Pentacene      -846.98953                                     117        10.6
Phenanthrene   -539.66320    -5.0     49.6±0.4    5.9±0.7     85         12.2
Chrysene       -693.33905    -10.1    64.5±1.6    5.4±2.7     109        12.1
Picene         -847.01581    -16.5                            133        12.1

a. Total energies are given in Hartrees, the other energies are given in kcal/mol.
b. Relative energy with respect to the isomeric linearly annelated hydrocarbon.
c. Pedley, J. B. "Thermochemical Data and Structures of Organic Compounds," Vol. 1, Thermodynamics Research Center, College Station, TX, 1994.
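The stabilization energies listed in Table 1 can be recomputed from the total energies with the group-increment expression used in this section. The Python sketch below is illustrative only. Two things in it are assumptions rather than statements from the chapter: the CH and quaternary-carbon counts, which are taken here from the molecular formulas, and the overall sign, which is chosen so that a stabilized molecule comes out positive (this is what reproduces the tabulated 36 and 60 kcal/mol for benzene and naphthalene).

```python
HARTREE_TO_KCAL = 627.5   # conversion factor quoted in the text
E_CH = 38.70853           # increment per CH group, hartree
E_C = 38.11054            # increment per quaternary carbon, hartree

# name: (total energy in hartree, n_CH, n_C, tabulated stabilization in kcal/mol)
# n_CH and n_C are assumptions derived from the molecular formulas.
TABLE_1 = {
    "benzene":      (-232.30855,  6, 0,  36),
    "naphthalene":  (-385.98493,  8, 2,  60),
    "anthracene":   (-539.65518, 10, 4,  80),
    "naphthacene":  (-693.32294, 12, 6,  99),
    "pentacene":    (-846.98953, 14, 8, 117),
    "phenanthrene": (-539.66320, 10, 4,  85),
    "chrysene":     (-693.33905, 12, 6, 109),
    "picene":       (-847.01581, 14, 8, 133),
}


def stabilization_energy(e_total, n_ch, n_c):
    """Stabilization energy in kcal/mol; positive means stabilized (sign is an assumption)."""
    return -HARTREE_TO_KCAL * (e_total + n_ch * E_CH + n_c * E_C)


def isomer_difference(angular, linear):
    """Energy of the angular isomer relative to the linear one, kcal/mol."""
    return HARTREE_TO_KCAL * (TABLE_1[angular][0] - TABLE_1[linear][0])


if __name__ == "__main__":
    for name, (energy, n_ch, n_c, tabulated) in TABLE_1.items():
        computed = stabilization_energy(energy, n_ch, n_c)
        print(f"{name:13s} computed {computed:6.1f}   tabulated {tabulated}")
```

The same two helper functions also give the relative energies of the angular isomers (phenanthrene, chrysene, picene) with respect to their linear counterparts.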
Using the experimentally determined stabilization energies for benzene and naphthalene (36 and 60 kcal/mol respectively), one may derive the following expression for the stabilization energy of the benzenoid hydrocarbons based on the ab initio calculations:

$\text{Stab. Energy} = -627.5\,(E_T + n_{CH} \times 38.70853 + n_{C} \times 38.11054) \qquad (1)$

where $E_T$ is the calculated total energy, 627.5 is the conversion factor for Hartrees to kcal/mol, $n_{CH}$ is the number of CH groups and $n_C$ is the number of quaternary carbons. The energies thus derived are summarized in Table 1. It can be seen that the stabilization energy per C-C bond is constant for benzene, naphthalene and the [n]-phenacenes, but decreases with increasing number of benzene rings with the linearly annelated arenes.

It is known that whereas benzene and naphthalene are relatively unreactive, the remaining compounds have much higher reactivity toward electrophiles and 1,2- or 1,4-additions. In the case of benzene, addition across a double bond will cause a 34 kcal/mol loss of stabilization (36 less 2 kcal/mol for the butadiene fragment) and addition to naphthalene will cause a 22 kcal/mol loss of stabilization (i.e. 60 kcal/mol less 38 kcal/mol for the remaining styrene unit). On the other hand, addition across the 9,10 positions in anthracene will only lead to an 8 kcal/mol loss
of stabilization (80 less 2×36) and addition to phenanthrene will only lead to a 13 kcal/mol (85 less 2×36) loss of stabilization. Thus, despite comparable stabilization energies per double bond, the latter two compounds would be expected to be much more reactive than the former. With pentacene, the stabilization energy is less than that of the two naphthalene units formed by 1,4 addition across the central ring, and this corresponds to its high reactivity. On the other hand, the isomeric picene would have an energy change comparable to that of phenanthrene, thus accounting for its greater stability.

The 4n cyclic polyunsaturated compounds have been of much interest. A B3LYP/6-311G** calculation for cyclobutadiene gave an energy of -154.71851, and using equation 1, the apparent stabilization energy is -73 kcal/mol. This must, however, be corrected for the strain present in the four-membered ring, which should be on the order of 30 kcal/mol. Thus, the π system is destabilized by about 40 kcal/mol. This fits in well with the concept of "antiaromaticity" in which it is suggested that destabilization occurs with the 4n molecules.²⁷ In the case of planar cyclooctatetraene, the calculated energy is -309.64811, indicating that it is destabilized by 13 kcal/mol. This should be close to the strain associated with the 135° C-C-C angles in the ring, and thus in this molecule the π system is neither stabilized nor destabilized. Addition of two electrons leads to a filled HOMO level, and now the ion is planar despite the strain associated with this geometry.

There are several types of aromatic systems in addition to the ones described above. One example is azulene, a member of a class of nonalternant conjugated hydrocarbons.²⁸ As a result of the different bridging pattern as compared to benzene, the π-energy levels are considerably shifted, leading to light absorption in the visible region, and a purple color, as well as reduced π-electron stabilization.

4. SIGMA CONTRIBUTION TO THE GEOMETRY OF BENZENE

Most recent studies have concluded that the regular hexagonal geometry of benzene results from the σ bonds rather than the π bonds. This is reasonable since the σ C-C bonds are stronger than the π bonds by about 20 kcal/mol. One would expect the σ bonds at each carbon would have the same hybridization if at all possible, and thus in the absence of any other overriding factor, benzene should adopt its observed geometry in order to minimize its σ energy. The structural preference for the π-bonds is not as clear. Berry noted that the antisymmetric C-C stretching mode, which effectively interconverts Kekule type structures, had an unusually low frequency, and concluded that the π-system preferred a Kekule-like structure.²⁹ Similar conclusions have been derived from
other studies of benzene.³⁰ However, the most recent study has concluded that the σ and π electrons both prefer the hexagonal structure.³¹

The group of simple annelated benzenes in Table 2 provide a convenient context for examining the role of the π-electrons in determining bond lengths.²⁴ With benzene, naphthalene and anthracene, for which good structural data are available, there is very good agreement between the calculated and observed C-C bond lengths. These lengths vary over a rather wide range, 1.339-1.451 Å. It is possible to calculate bond indices from the ab initio wavefunctions using the method developed by Fulton and Mixon.³² Further, the total bond index may be separated into its σ and π components. In the case of benzene, the C-C π bond order is 0.425, and the σ bond index is 0.965.³³ It is found that whereas the π components cover a wide range, from 0.21 to 0.52, the σ components only vary from 0.92 to 0.98. Therefore the σ-bonding is relatively insensitive to bond lengths, and the latter are mainly controlled by the π terms. They are related to the bond lengths as shown in Figure 2.
[Figure 2: plot of the π-bond index (vertical axis, -0.10 to 0.60) against C-C bond length (horizontal axis, 1.35 to 1.55).]
Figure 2. Relationship between the π-bond index and the C-C bond length for benzenoid hydrocarbons.

It has been suggested that equalization of bond lengths is one characteristic of aromatic compounds.³⁴ However, the data in Table 2 show that this is not a requirement for stabilization, and also show that the π-system does have a large effect on the structure.
5. MAGNETIC PROPERTIES

It is known that benzene and the other well recognized aromatic compounds have an unusually large diamagnetic susceptibility.³⁵ This is presumably due to the ring current that can be induced in these compounds in the presence of a magnetic field.³⁶ These ring currents will lead to anisotropic susceptibilities, with the tensor component normal to the aromatic ring being much larger than the in-plane components.³⁷ It is interesting to note that in contrast to the above, the antiaromatic compounds are calculated to have significant paramagnetic susceptibilities.³⁸ Although the susceptibility is a useful criterion, it is not so easily measured experimentally.

One of the more useful criteria for "aromatic" character is derived from NMR chemical shifts. It is known that the protons of benzene are found at a lower field than ordinary olefinic protons, and it has been attributed to the ring current in the π-system which will reinforce the applied field at the protons. This has found confirmation in the observation that protons placed over an aromatic ring will be shifted upfield.³⁹ A particularly striking effect is found with the planar cyclooctadecanonaene in which the outer hydrogens have an unusually large downfield shift (δ 9.28) and the inner hydrogens have a remarkably large upfield shift (δ -2.99).⁴⁰

[Scheme: cyclooctadecanonaene, showing the inner and outer hydrogens.]

Schleyer has suggested that the calculated magnetic shielding at the center of a ring might be a useful indicator of aromatic character.⁴¹ Here, it was found that a plot of the shielding against the aromatic stabilization energies for a series of five-membered ring heterocycles gave a good linear relationship. Cyclopentadiene gave a very small shielding, and it went to -19 ppm with the well stabilized cyclopentadienyl anion. The introduction of a BH into the ring, which should lead to a 4e antiaromatic system, was calculated to give a +17 ppm (i.e., paramagnetic) shielding. Many aromatic and antiaromatic compounds were included in this study, and it provides the best justification for the use of NMR chemical shifts in studying
Table 2. Calculated and observed structures of acenes.

Compound          Method    A          B          C          D          E          F          G          H          I          J          K
Benzene           Theor     1.394
                  IR(a)     1.390
                  π-Index   0.425
Naphthalene (1)   Theor     1.420      1.375      1.415      1.431
                  Xray(b)   1.425(1)   1.378(1)   1.421(1)   1.426(1)
                  ED(c)     1.422(3)   1.381(2)   1.417(4)   1.412(8)
                  π-Index   0.302      0.519      0.325      0.287
Anthracene (2)    Theor     1.429      1.367      1.424      1.443      1.398
                  Xray(b)   1.434(1)   1.369(1)   1.431(1)   1.441(1)   1.403(1)
                  ED(d)     1.437(4)   1.397(4)   1.422(16)  1.437(4)   1.392(6)
                  π-Index   0.268      0.557      0.286      0.245      0.380
Naphthacene (3)   Theor     1.433      1.364      1.429      1.450      1.390      1.409      1.450
                  π-Index   0.253      0.573      0.269      0.224      0.416      0.336      0.222
Pentacene (4)     Theor     1.437      1.361      1.434      1.454      1.385      1.415      1.453      1.401
                  π-Index   0.245      0.585      0.257      0.213      0.436      0.314      0.210      0.370
Phenanthrene (5)  Theor     1.413      1.378      1.406      1.380      1.413      1.424      1.434      1.356      1.457
                  Xray(e)   1.428(9)   1.374(17)  1.386(14)  1.399(15)  1.412(8)   1.416(8)   1.450(7)   1.341(10)  1.468(10)
                  π-Index   0.330      0.486      0.336      0.484      0.341      0.316      0.241      0.592      0.213
Chrysene (6)      Theor     1.415      1.376      1.407      1.378      1.416      1.425      1.425      1.361      1.452      1.430      1.415
                  Xray(f)   1.428      1.363      1.394      1.381      1.409      1.409      1.421      1.368      1.468      1.428      1.401
                  π-Index   0.320      0.497      0.325      0.495      0.329      0.308      0.260      0.564      0.231      0.267      0.362
Picene (7)        Theor     1.414      1.376      1.414      1.379      1.415      1.424      1.427      1.360      1.452      1.433      1.4??(g)
                  π-Index   0.323      0.494      0.349      0.491      0.332      0.310      0.254      0.572      0.225      0.260      0.347

a. Pliva, J.; Johns, J. W. C.; Goodman, L. J. Mol. Spectrosc. 1991, 248, 427.
b. Brock, C. P.; Dunitz, J. D.; Hirshfeld, F. L. Acta Cryst. 1991, B47, 789.
c. Ketkar, S. N.; Fink, M. J. Mol. Struct. 1981, 77, 139.
d. Ketkar, S. N.; Kelley, M.; Fink, M.; Ivey, R. C. J. Mol. Struct. 1981, 77, 127.
e. Kay, M. I.; Okaya, Y.; Cox, D. E. Acta Cryst. 1971, B27, 26.
f. Cruickshank, D. W. J.; Sparks, R. A. Proc. Roy. Soc. (London) 1960, A258, 270.
g. L = 1.446 (0.253), M = 1.420 (0.290), N = 1.366 (0.535).
aromaticity. Some interesting observations were that benzene and naphthalene have the same shielding at the center of their rings (-11.5 ppm), and that the center ring of anthracene has greater shielding at its center (-14.3 ppm), and the opposite is true for the outer rings (-9.4 ppm).

The magnetic criterion is, however, not a unique indicator of aromaticity. The linear polymethinium ions:⁴²

(CH3)2N-(CH=CH)n-CH=N⁺(CH3)2

meet most of the requirements. Unlike the linear polyenes that have only a weak conjugation between the double bonds, and have alternating single and double bonds, these ions have essentially equal C-C bond lengths. The π electrons are delocalized in the same manner as with benzene, i.e. one π electron per C-C bond, and just as with benzene, the electronic spectra can be predicted using the free electron model. Although fitting most of the criteria for aromaticity, these ions cannot have significant magnetic properties. The latter requires a cyclic conjugated system.

6. ORIGIN OF THE STABILIZATION OF BENZENE.

Although it is easy to demonstrate that benzene and other "aromatic" systems are stabilized, it is not as easy to determine the exact origin of the stabilization. Both valence bond and molecular orbital theories can provide a formalism for "explaining" the stabilization, and the latter can quantitatively account for the energy of benzene and its low reactivity. However, they do not provide a physical model for the stabilization. The latter must come from a consideration of the electron density distribution, for that alone determines the energy of a molecule.

Dewar and Schmeising in 1959 provided a simple explanation for the stabilization of benzene which is probably very close to the correct answer.⁴³ They pointed out that the essential difference between a Kekule structure and benzene itself is that in the former the π-electrons are paired, whereas in the latter they are arranged one π-electron per bond. This delocalization of the π-electrons will reduce their electron repulsion, and result in a net decrease in the total energy. One may come to this conclusion by considering either the VB or the MO representations for benzene.

7. HETEROCYCLIC AROMATIC SYSTEMS.

The replacement of CH groups of benzene by nitrogen leads to the azines. Pyridine, pyrazine and pyrimidine have essentially the same stabilization as benzene. Thus, hydrogen transfer reactions between 1,3-cyclohexadiene and the
azine to give benzene and a dihydroazine are calculated to be close to thermoneutral.44 Among the diazines, only pyridazine, having an N=N bond, has reduced stabilization. The π-electron populations are found to be somewhat greater at the more electronegative nitrogens than at the carbons, but this appears to have little effect on the π-electron stabilization. The replacement of -CH=CH- of benzene by oxygen or sulfur to give furan and thiophene again leads to a 6 π-electron system. The higher electronegativity of oxygen leads to a concentration of π-electrons at oxygen, and significantly reduced π-stabilization. Thus, in many ways furan acts as an enol ether, and it readily undergoes the Diels-Alder reaction. On the other hand, thiophene, with a sulfur that has about the same electronegativity as carbon, has stabilization and properties similar to those of benzene.45 The replacement of -CH=CH- of benzene by BH would lead to a 4 π-electron system which should be antiaromatic, and it has been calculated to have a paramagnetic shift at the center of its ring, in contrast to the diamagnetic shift characteristic of the 6 π-electron systems.41

8. SUMMARY.

Aromaticity may be considered as the stabilization of a molecule that results from π-electron delocalization in a closed shell system. It has a number of chemical manifestations such as an endothermic first hydrogenation energy, a tendency toward bond length equalization, enhanced diamagnetic susceptibilities and characteristic NMR chemical shifts. However, as noted above, none of these is really unique. Polycyclic aromatic compounds having significant π-electron stabilization often have a variety of bond lengths. Linear aromatic systems cannot have special magnetic properties since such properties result from cyclic delocalized systems.

In discussing resonance, Pauling stated, "The theory of resonance in chemistry is essentially a qualitative theory, which like the classical structural theory, depends for its successful application largely upon a chemical feeling that is developed through practice."46 The same is true with aromaticity.
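As an illustrative aside, the contrast between 6 π-electron and 4 π-electron rings can be reproduced with simple Hückel theory (ref. 3). The Python sketch below is not part of the original analysis. It assumes the textbook Hückel level formula for a planar monocyclic ring, x_k = 2 cos(2πk/n) in units of β, and compares the total π energy of the ring with that of the same number of isolated double bonds. The function names are hypothetical, and β is taken as negative, so a positive result means stabilization.

```python
from math import cos, pi


def ring_levels(n_atoms):
    """Hueckel levels of a planar monocyclic ring, as coefficients x in E = alpha + x*beta."""
    return sorted((2.0 * cos(2.0 * pi * k / n_atoms) for k in range(n_atoms)), reverse=True)


def ring_pi_energy(n_atoms, n_electrons):
    """Total pi energy (coefficient of beta), filling levels from the most bonding one."""
    total, remaining = 0.0, n_electrons
    for x in ring_levels(n_atoms):
        occupancy = min(2, remaining)
        total += occupancy * x
        remaining -= occupancy
    return total


def delocalization_energy(n_atoms, n_electrons):
    """Ring pi energy minus that of n_electrons/2 isolated C=C bonds (2*beta each)."""
    localized = 2.0 * (n_electrons // 2)
    return ring_pi_energy(n_atoms, n_electrons) - localized


# Benzene (6 pi electrons) versus square cyclobutadiene (4 pi electrons), in units of beta.
print("benzene:", round(delocalization_energy(6, 6), 6))
print("cyclobutadiene:", round(delocalization_energy(4, 4), 6))
```

Within this very crude model benzene gains two units of β relative to three isolated double bonds, whereas square cyclobutadiene gains nothing. As the text stresses, such numbers are a formalism rather than a physical explanation of the stabilization.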
References:
1 Aromatic is defined as a substance "characterized by a fragrant smell, and usually by a warm, pungent taste" (Merriam-Webster New International Dictionary). Reviews: Garratt, P. J. "Aromaticity," Wiley, NY, 1986. Minkin, V. J.; Glukhovtsev, M. N.; Simkin, B. Y. "Aromaticity and Antiaromaticity: Electronic and Structural Aspects," Wiley, NY, 1994.
2 Pauling, L. J. Chem. Phys. 1933, 1, 280. Pauling, L.; Wheland, G. W. J. Chem. Phys. 1933, 1, 362. Pauling, L.; Sherman, J. J. Chem. Phys. 1933, 1, 679.
3 Hückel, E. Z. Physik 1931, 70, 204. Hückel, E. Z. Elektrochem. angew. physik. Chem. 1937, 42, 827.
4 Ingold, C. K. "Structure and Mechanism in Organic Chemistry," Cornell Univ. Press, Ithaca, NY, 1953, p. 238.
5 When two or more significant resonance structures may be written for a molecule, it will be stabilized with respect to the basic structure. As a result, the energy required to reach the transition state will be increased unless there are corresponding interactions in the transition state. The latter will be the case for electrophilic substitution on anisole.
6 C5H5-: Bordwell, F. G.; Drucker, G. E.; Fried, H. E. J. Org. Chem. 1981, 46, 632. C7H7+: Doering, W. v. E.; Knox, L. H. J. Am. Chem. Soc. 1954, 76, 3203. C8H8^2-: Katz, T. J. J. Am. Chem. Soc. 1960, 82, 3784.
7 This is an example of Jahn-Teller distortion: Jahn, H. A.; Teller, E. Proc. Roy. Soc. 1937, A161, 220.
8 Breslow, R.; Mazur, S. J. Am. Chem. Soc. 1973, 95, 584.
9 Bastiansen, O.; Hedberg, L.; Hedberg, K. J. Chem. Phys. 1957, 27, 1311.
10 Wheland, G. "The Theory of Resonance and its Applications to Organic Chemistry," Wiley, NY, 1944, p. 94.
11 Kawajima, S. J. Am. Chem. Soc. 1984, 106, 6496. Maynau, D.; Malrieu, J.-P. J. Am. Chem. Soc. 1982, 104, 3029. Mulder, J. J. L.; Oosterhoff, L. J. Chem. Commun. 1970, 305, 307.
12 Heitler, W.; London, F. Z. Physik 1927, 44, 455.
13 Wang, S. C. Phys. Rev. 1928, 31, 579.
14 Coulson, C. A. Trans. Faraday Soc. 1937, 33, 1479.
15 Weinbaum, S. J. Chem. Phys. 1933, 1, 593.
16 In this simple example, configuration interaction serves mainly to correct for the use of a minimal basis set, and allows correct dissociation. When more complete basis sets are used, configuration interaction serves to correct for electron correlation.
17 Cf. Schultz, P. A.; Messmer, R. P. J. Am. Chem. Soc. 1993, 115, 10943.
18 Bobrowicz, F. W.; Goddard, W. A., III, in "Methods of Modern Electronic Structure Theory. Vol. 3. Modern Theoretical Chemistry," Schaefer, H. F., III, Ed. Plenum Press, NY, 1977.
19 Parr, R. G.; Yang, W. "Density-Functional Theory of Atoms and Molecules," Oxford Univ. Press, NY, 1989, p. 51ff.
20 Coonradt, H. L.; Hartough, H. D. J. Am. Chem. Soc. 1948, 70, 1158. Blicke, F. F.; Burckhalter, J. H. J. Am. Chem. Soc. 1942, 64, 477.
21 Pauling, L. "The Nature of the Chemical Bond," 3rd Ed., Cornell Univ. Press, Ithaca, 1960, p. 193.
22 Kistiakowsky, G. B.; Ruhoff, J. R.; Smith, H. A.; Vaughan, W. E. J. Am. Chem. Soc. 1935, 57, 876; 1936, 58, 237, 146.
23 Simpson, W. T. J. Am. Chem. Soc. 1953, 75, 597.
24 Wiberg, K. B. J. Org. Chem. 1977 to be published.
25 It should be noted that there is not general agreement on how stabilization energies should be calculated. Cf. Chestnut, D. B.; Davis, K. M. J. Comput. Chem. 1997, 18, 584.
26 Cioslowski, J.; Liu, G.; Martinov, M.; Piskorz, P.; Moncrieff, D. J. Am. Chem. Soc. 1996, 118, 5261.
27 Breslow, R. Acc. Chem. Res. 1973, 6, 393.
28 Coulson, C. A.; Longuet-Higgins, H. C. Proc. Roy. Soc. London 1947, A192, 16.
29 Berry, R. S. J. Chem. Phys. 1961, 35, 2253.
30 Shaik, S. S.; Bar, R. Nouv. J. Chim. 1984, 8, 411. Hiberty, P. C.; Shaik, S. S.; Lefour, J.-M.; Ohanessian, G. J. Org. Chem. 1985, 50, 4657. Shaik, S. S.; Hiberty, P. C.; Lefour, J.-M.; Ohanessian, G. J. Am. Chem. Soc. 1987, 109, 363. Shaik, S. S.; Hiberty, P. C.; Lefour, J.-M.; Ohanessian, G. J. Phys. Chem. 1988, 92, 5086. Hiberty, P. C., in Topics in Current Chemistry, Gutman, I.; Cyrin, S. J., Eds. Springer: New York, 1990, Vol. 153, p. 27. Jug, K.; Koster, A. M. J. Am. Chem. Soc. 1990, 112, 6772.
31 Glendening, E. D.; Faust, R.; Streitwieser, A.; Vollhardt, K. P. C.; Weinhold, F. J. Am. Chem. Soc. 1993, 115, 10952. Cf. Hiberty, P. C.; Ohanessian, G.; Shaik, S. S.; Flament, J. P. Pure Appl. Chem. 1993, 65, 35.
32 Fulton, R. L. J. Phys. Chem. 1993, 97, 7516. Fulton, R. L.; Mixon, S. T. J. Phys. Chem. 1993, 97, 7530.
33 The π-bond index is less than 0.5 because there are also small 1,3 and 1,4 contributions.
34 Julg, A.; Francois, Ph. Theor. Chim. Acta 1967, 7, 249. Cf. Minkin et al., ref. 1.
35 Pauling, L. J. Chem. Phys. 1936, 4, 673. Dauben, H. J., Jr.; Wilson, J. D.; Laity, J. L. J. Am. Chem. Soc. 1968, 90, 811; 1969, 91, 1991. Dauben, H. J., Jr. "Diamagnetic Susceptibility Exaltation as a Criterion of Aromaticity," in "Non-Benzenoid Aromatics," Snyder, Ed., Vol. 2, Academic Press, NY, 1971.
36 Pople, J. A. J. Chem. Phys. 1956, 24, 1111.
37 Benson, R. C.; Flygare, W. H. J. Am. Chem. Soc. 1970, 92, 7523. Hurter, D. H.; Flygare, W. H. Top. Curr. Chem. 1976, 63, 1976.
38 Schleyer, P. v. R.; Jiao, H. Pure Appl. Chem. 1996, 68, 209.
39 Cf. Vogel, E.; Roth, H. D. Angew. Chem. Int. Ed. Engl. 1964, 3, 228. Gaoni, Y.; Malera, A.; Sondheimer, F.; Wolovsky, R. Proc. Chem. Soc. 1964, 397. Boekelheide, V.; Phillips, J. B. J. Am. Chem. Soc. 1967, 89, 1695.
40 Baumann, H.; Oth, J. F. M. Helv. Chim. Acta 1982, 65, 1885.
41 Schleyer, P. v. R.; Maerker, C.; Dransfeld, A.; Jiao, H.; van Eikema Hommes, N. J. R. J. Am. Chem. Soc. 1996, 118, 6317.
42 Dauben, H. J., Jr.; Feniak, G. unpublished results. Cf. Feniak, G. Ph.D. Thesis, University of Washington, 1955. Wiberg, K. B. "Physical Organic Chemistry," Wiley, NY, 1965, p. 9. Dähne, S.; Hoffmann, K. Prog. Phys. Org. Chem. 1990, 18, 1.
43 Dewar, M. J. S.; Schmeising, H. N. Tetrahedron 1959, 5, 166.
44 Wiberg, K. B.; Nakaji, D.; Breneman, C. M. J. Am. Chem. Soc. 1989, 111, 4178.
45 Gronowitz, G., Ed. "Thiophene and Its Derivatives," Wiley/Interscience, NY, 1985.
46 Ref. 21, p. 220.
Z.B. Maksić and W.J. Orville-Thomas (Editors), Pauling's Legacy: Modern Modelling of the Chemical Bond. Theoretical and Computational Chemistry, Vol. 6. © 1999 Elsevier Science B.V. All rights reserved.
Hypercoordinate bonding to main group elements: the spin-coupled point of view

David L. Cooper^a, Joseph Gerratt^b and Mario Raimondi^c

^a Department of Chemistry, University of Liverpool, P.O. Box 147, Liverpool L69 3BX, United Kingdom
^b School of Chemistry, University of Bristol, Cantocks Close, Bristol BS8 1TS, United Kingdom
^c Dipartimento di Chimica Fisica ed Elettrochimica, Università di Milano, Via Golgi 19, 20133 Milano, Italy

Modern valence bond theory, in its spin-coupled form, has been used to investigate the nature of hypercoordinate bonding to main group elements. The systems that have been studied in this way include: halides, oxides and oxohalides of phosphorus, sulfur and chlorine; xenon fluorides; SiH5− and CH5−; 1,3-dipoles; oxohalides of hypercoordinate nitrogen; fluorophosphoranes; and YXXY dihalides and dihydrides of dioxygen and disulfur. The bonds involving hypercoordinate atoms tend to be highly polar. We find no significant qualitative differences between the hypercoordinate nature of first-row, second-row and noble gas atoms in appropriate chemical environments, nor between the descriptions of the bonding in hypercoordinate and so-called 'normal octet' molecules, except for some differences in bond polarity. We suggest that the 'octet rule' be demoted in favour of the democracy principle: almost all valence electrons can participate in chemical bonding if provided with sufficient energetic incentives. Simple concepts of atomic size and of electronegativity differences prove to be of particular utility in qualitative descriptions. We find no evidence for the utilization of d functions as valence orbitals, or to support notions of pπ-dπ back-bonding.

1. INTRODUCTION
In spite of all the theoretical evidence, accumulated over many years, it is still commonplace for students to be taught that the existence of hypercoordinate molecules such as SF6 and PF5 relies on the utilization of d orbitals to 'expand the octet'. Indeed, even models based on d2sp3, dsp2 and dsp3 hybrid orbitals or pπ-dπ back-bonding are still in use to describe hypercoordinate bonding to second-row elements. Of course, the consensus view that has emerged from most of the reliable ab initio investigations published in recent years is that d functions act as polarization functions for second-row atoms, compensating for the inflexibility of s/p basis sets, albeit to a somewhat greater extent than for first-row atoms. See, for example, Refs. 1-11, and references therein. Certainly, it is not justified to regard these d functions as valence orbitals. There are many texts that make the point very clearly that the bonding in a molecule such as SF6 has very little to do with the availability of d atomic orbitals, but this is normally done in the context of MO theory, whereas the general ideas of utilizing d orbitals are much more closely allied with the ideas of classical valence bond theory. This, perhaps, is one of the reasons for the continued survival of such models. The purpose of this Chapter is to describe various calculations which have been performed using modern valence bond theory, in its spin-coupled form, resulting in a useful aide memoire which we term the democracy principle. We argue that there are no significant qualitative differences between the hypercoordinate nature of first-row, second-row and noble gas atoms in appropriate chemical environments.

2. d-ORBITAL PARTICIPATION VERSUS DEMOCRACY
The basic methodology that we have used to study the chemical bonding to hypercoordinate main-group elements is much the same as that described in an earlier Chapter: "The spin-coupled description of aromatic, antiaromatic and nonaromatic systems". The spin-coupled wavefunctions for the systems considered here take the form
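$$\Psi_{SC} = \mathcal{A}\left\{ \varphi_1^2 \varphi_2^2 \cdots \varphi_n^2 \, \Theta_{pp}^{2n} \; \psi_1 \psi_2 \cdots \psi_N \, \Theta_{00}^{N} \right\} \qquad (1)$$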
in which the ψ_μ are distinct, singly-occupied, non-orthogonal orbitals for the N 'active' electrons, Θ_00^N is an N-electron eigenfunction of Ŝ² and Ŝz for S=0 (and thus M=0), and A is the antisymmetrizer. The doubly-occupied orbitals φ_i accommodate the 'inactive' electrons and Θ_pp^2n is the corresponding perfectly-paired spin function. The total wavefunction is invariant to normalization of the orbitals, to orthogonalization of the φ_i amongst themselves, as well as to orthogonalization of the active orbitals to the inactive ones. On the other hand, once the division into inactive and active spaces has been made, the converged spin-coupled wavefunction is a unique outcome of the variational procedure: it is not invariant to general linear transformations of the ψ_μ.

The basic strategy adopted for 'normal octet' and hypercoordinate molecules XYn was first to carry out a standard closed-shell RHF calculation and then to localize the orbitals according to the population or overlap criterion introduced by Pipek and Mezey [12]. In all cases, it was straightforward to identify localized molecular orbitals (LMOs) associated with particular X-Y bonds. Visual inspection of the bond LMOs for various phosphorus and sulfur halides revealed no evidence for the active participation of d orbitals. For example, not only are the P-F LMOs in PF3 very similar to those in PF5, but there is little change on excluding d functions from the calculation. Furthermore, contrary to the expectations of the standard dsp3 hybridization model of the bonding in PF5, it was difficult to distinguish P-Fax from P-Feq, as is demonstrated in Figure 1.
Figure 1: Localized molecular orbitals for PF5 (panels labelled P-Fax, P-Feq and 'no d').

In the subsequent spin-coupled calculations [9], the ψ_μ were optimized as linear combinations of all the LMOs which correspond to X-Y bonds, plus all the virtual orbitals. This scheme is entirely equivalent to expanding the ψ_μ in the full atomic basis set, except that it maintains the orthogonality between the active orbitals and the inactive space, which consists of all the other doubly-occupied MOs. The spin function Θ_00^N was fully optimized in the full spin space. The active orbitals were thus fully optimized without constraints on their form, on the degree of localization, on the overlaps between them, or on the mode of coupling the electron spins. Nevertheless, we found for each molecule that the optimized spin-coupled orbitals consist of pairs, each clearly associated with a particular two-centre bond, and with predominantly singlet coupling of the electron spins. For example, we show in Figure 2 the pair of spin-coupled
orbitals, ψ1 and ψ2, associated with one of the S-F bonds in SF6. The first of these orbitals is a two-centre function: it comprises the combination of an sp^x-like hybrid from sulfur with some 2p character from fluorine. The second orbital is largely a 2p function on fluorine. Spin-coupled orbitals ψ3-ψ12 can be obtained from ψ1 and ψ2 by symmetry operations of the molecular point group. It is clear from the parentage of the ψ_μ that the S-F bonds in SF6 are very polar. Perfect-pairing dominates the total spin function, with a contribution in excess of 99%. The dominant orbital overlaps (≈0.8) occur of course within the pairs that describe each bond, but members of different pairs are not orthogonal: for example, the overlap between different sulfur+fluorine hybrids is ≈0.3. The form of the spin-coupled orbitals, the overlaps between them and the mode of spin coupling change very little if d functions are excluded from the calculations. Perhaps not surprisingly, we could find no evidence to support a traditional d2sp3 model of the bonding.
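To give a feeling for the size of the 'full spin space' in which the spin function is optimized, the following Python sketch counts the linearly independent spin eigenfunctions for N active electrons using the standard branching-diagram formula, f(N, S) = C(N, N/2 - S) - C(N, N/2 - S - 1). It is an illustration only: the function name is hypothetical, and the active-electron counts assume one pair of active electrons per two-centre bond, as in the description above.

```python
from math import comb


def spin_space_dimension(n_electrons, two_s=0):
    """Number of linearly independent spin eigenfunctions for N electrons.

    two_s is 2S, so two_s=0 is a singlet. Returns 0 for impossible (N, S) pairs.
    """
    if n_electrons < 0 or two_s < 0 or two_s > n_electrons:
        return 0
    if (n_electrons - two_s) % 2:
        return 0
    n_down = (n_electrons - two_s) // 2  # this is N/2 - S
    lower = comb(n_electrons, n_down - 1) if n_down >= 1 else 0
    return comb(n_electrons, n_down) - lower


# Assumed active spaces: two electrons per bond.
for label, n_active in [("PF5 (5 bonds)", 10), ("SF6 (6 bonds)", 12)]:
    print(label, n_active, "electrons ->", spin_space_dimension(n_active), "singlet spin functions")
```

Under these assumptions the singlet spin space for SF6 contains 132 functions, so a perfect-pairing weight above 99% is a result of the variational optimization rather than something imposed on the wavefunction.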
Figure 2: Symmetry-unique spin-coupled orbitals for SF6.

The spin-coupled description of the bonding in PF5 resembles that in SF6, with each bond described by the overlap of a phosphorus+fluorine hybrid, split almost equally between P and F, and a distorted F(2p) orbital. The bonds are clearly highly polar. It was difficult to discern differences between the various phosphorus+fluorine hybrids or fluorine orbitals in equatorial and axial positions. In the standard dsp3 hybridization model of PF5, the three equatorial bonds, based on P(sp2), are supposed to be somewhat different from the two axial bonds, based on P(pd). There is no evidence for this in the spin-coupled calculations or for significant involvement of d orbitals in the bonding. It is useful to bear in mind, contrary to the expectations of the dsp3 model, that the two sets of bond lengths are very similar, differing by no more than expected from different steric repulsions, and that the two sets of 19F chemical shifts are very similar.
Calculations of this type have also been performed for PXn (n=3,5) and SXn (n=2,4) fluorides and chlorides [9]. Much the same basic picture emerges for all of these systems, whether 'normal octet' or hypercoordinate, with the variations in the amount of central-atom character in the two-centre spin-coupled orbital reflecting the polarity of the particular bond. Analogous descriptions were found to apply for XeFn (n=2,4) and for SiX5− ions (X=H,F). The outcome of our own various numerical experiments with exponents of d basis functions is consistent with the findings of many earlier studies. Simple qualitative notions of significantly polar bonds are certainly of greater utility than supposed d orbital participation in hybridization schemes. Optimum d exponent(s) for a given second-row element, as well as the energy improvement per bond, change very little from 'normal octet' to hypercoordinate systems. Furthermore, the nature of the attached groups and the expansion of the octet have less effect on d function exponents than does the effective nuclear charge of the second-row atom which bears them. From the point of view of the total energy, it can be much more efficient to put d functions on attached electronegative atoms than on the second-row atom. The (small) utilization of d functions on the central atom tends to diminish with increasing quality of s/p basis set. Based on our findings for all of the systems we have studied, we assert the

democracy principle: Almost all valence electrons can participate in chemical bonding if provided with sufficient energetic incentives.

The 'constitution' of this principle is founded on the principle of minimizing the total energy, and it is ultimately this last criterion alone which determines how many electrons a particular atom will utilize in chemical bonding. Nevertheless, we can identify features which are likely to be favourable, such as polar bonds which shift density away from a central atom, especially if the formal number of bonds is high. Differences in electronegativity can be a useful first guide to the possible existence of a particular hypercoordinate species. Of course, the sizes of atoms may be such that it is not possible in some cases to cram sufficient electronegative atoms around the central atom. This is one of the reasons why hypercoordinate bonding is likely to be less common for first-row atoms. However, it is clear from the chemistry of xenon, for example, in which the highest formal oxidation states are achieved in oxides and oxofluorides, that doubly-bonded oxygen atoms are at least as effective partners in hypercoordinate bonding as are fluorine atoms. One consequence of this line of reasoning is that in sterically limited situations, such as around first-row atoms, hypercoordinate molecules are more likely to feature at least some highly polar X=O double bonds than larger numbers of X-F bonds. Concentrating on the key aspects of size and electronegativity, it is easy to make various straightforward predictions/rationalizations, albeit some of them with more than a little hindsight. Fluorine and, to a lesser extent, chlorine are
sufficiently electronegative relative to phosphorus for the formation of PF5 and PCl5, but nitrogen is too small for the formation of a stable NF5 molecule. However, the formal replacement of two N-F bonds by N=O should significantly reduce the steric crowding, such that it would be surprising if F3NO did not feature hypercoordinate bonding. We return to this issue in Section 3.2. Sulfur is more electronegative than phosphorus, and it seems that fluorine can still form sufficiently polar bonds for species up to SX6, but that chlorine cannot. Even SCl4 is generally formulated as SCl3+Cl−, rather than as a hypercoordinate species analogous to SF4. Moving instead one to the left of phosphorus, it should come as little surprise that silicon should form SiF5− and SiF6^2−. Indeed, even SiH5−, and various derivatives, are stable intermediates that can be studied, for example, in flowing afterglows [13]. One of the models that has been proposed for the bonding in SiH5− invokes resonating axial bonds, one based on Si(3pz) and the other on overlap of H(1s) with an antibonding σ* orbital from the equatorial SiH3 unit, with delocalization of the fifth valence electron into the equatorial Si-H bonds [14]. The spin-coupled calculations [9], on the other hand, reveal a mode of bonding that is analogous to that in SiF5− or PF5, and the descriptions of the axial and equatorial Si-H bonds are very similar to one another. Given the relative electronegativities of carbon and hydrogen, it is easy to rationalize why CH5− should be no more than a high energy transition state in certain reactions. In the spin-coupled description of a molecule such as SF6, the sulfur atom contributes six equivalent, nonorthogonal sp^x-like hybrids which delocalize onto the fluorine atoms. Each of these two-centre orbitals overlaps with a distorted F(2p) function and the perfect-pairing spin function dominates.
Of course, using only 3s, 3px, 3py and 3pz atomic orbitals, we can at most form four linearly independent hybrid orbitals localized on sulfur, with a maximum occupancy of 8 electrons, as in the octet rule. However, the six sulfur+fluorine hybrids which emerge in the spin-coupled description are not linearly dependent, precisely because each of them contains a significant amount of F(2p) character. It is thus clear that the polar nature of the bonding is crucial. Unlike classical valence bond theory, the spin-coupled approach does not presuppose the form of the orbitals or constrain them to be one-centred. Instead, each orbital is allowed to delocalize onto other centres as much or as little as is necessary to minimize the total energy. Numerous studies have shown that augmenting the spin-coupled configuration with those in which one or more active orbitals is doubly-occupied has a very modest effect on the total energy. The reference configuration remains dominant, so that the essential physical picture is unchanged. Even a very small utilization of basis functions from other centres can correspond in classical VB terms to the significant utilization of very large numbers of ionic structures. In the case of the spin-coupled description of the bonding in SF6, the corresponding classical VB wavefunction involves resonance between the vast number of possible ionic structures. There is nothing intrinsically wrong with such a description, except that we believe it to be unnecessarily complicated.
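The linear-dependence argument can be checked numerically. The sketch below is a toy model, not a reproduction of the spin-coupled calculation: it assumes six unnormalised s+p hybrids pointing along the octahedral directions, written as coefficient vectors over (3s, 3px, 3py, 3pz), and optionally adds a component on a 2p function of the bonded fluorine. The weights and helper names are hypothetical, and the basis functions themselves are assumed to be linearly independent. Only the ranks matter.

```python
from fractions import Fraction

# Bond directions of an octahedral XY6 molecule.
DIRECTIONS = [(1, 0, 0), (-1, 0, 0), (0, 1, 0), (0, -1, 0), (0, 0, 1), (0, 0, -1)]


def rank(rows):
    """Rank of a small matrix by Gaussian elimination with exact fractions."""
    m = [[Fraction(x) for x in row] for row in rows]
    r = 0
    n_cols = len(m[0]) if m else 0
    for col in range(n_cols):
        pivot = next((i for i in range(r, len(m)) if m[i][col] != 0), None)
        if pivot is None:
            continue
        m[r], m[pivot] = m[pivot], m[r]
        for i in range(r + 1, len(m)):
            factor = m[i][col] / m[r][col]
            m[i] = [a - factor * b for a, b in zip(m[i], m[r])]
        r += 1
        if r == len(m):
            break
    return r


def hybrids(f_weight):
    """Six hybrids as rows over (3s, 3px, 3py, 3pz, F1(2p), ..., F6(2p))."""
    rows = []
    for k, (x, y, z) in enumerate(DIRECTIONS):
        sulfur_part = [1, x, y, z]  # s plus a p function pointing along the bond
        fluorine_part = [0] * 6
        fluorine_part[k] = f_weight  # character on the bonded fluorine only
        rows.append(sulfur_part + fluorine_part)
    return rows


print("one-centre hybrids, rank:", rank(hybrids(0)))
print("two-centre hybrids, rank:", rank(hybrids(Fraction(1, 2))))
```

With no fluorine character the six hybrids span only four dimensions, so at most four are linearly independent. Any non-zero weight on the bonded fluorine makes all six independent, which is the point made above about the polar, two-centre nature of the orbitals.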
In addition to asserting the democracy principle, that all valence electrons can in principle be used in bonding, we suggest that the much-loved octet rule should be demoted. The bonding in hypercoordinate species does not differ in any significant qualitative fashion from the bonding in 'normal octet' molecules. Furthermore, there are of course plenty of examples, ranging from carbenes to much of the chemistry of heavier main group elements ('inert pair effect'), for which the central atom does not achieve an octet. Of course, there is a great deal of evidence to support the notion that it can be favourable to achieve a formal count of four electron pairs around a central atom, often arranged in a pseudotetrahedral fashion, and so we should retain some sort of 8-electron rule. We should not, however, see such an 8-electron arrangement as some sort of norm, from which deviations have to be explained in terms of additional effects.
3. HYPERCOORDINATE BONDING TO FIRST-ROW ATOMS

3.1. 1,3-dipoles

Examples of 1,3-dipoles include diazoalkanes, nitrones, carbonyl ylides and fulminic acid. Organic chemists typically describe 1,3-dipolar cycloaddition reactions [15] in terms of four out-of-plane 'π electrons' from the dipole and two from the dipolarophile. Consequently, most of the interest in the electronic structure of 1,3-dipoles has been concentrated on the distribution of the four π electrons over the three heavy atom centres. Of course, a characteristic feature of this class of molecules is that it presents awkward problems for classical valence theories: a conventional fashion of representing such systems invokes resonance between a number of zwitterionic and diradical structures [16-19]. Much has been written on the amount of diradical character, with widely differing estimates of the relative weights of the different bonding schemes. From various comparisons of the bond lengths of 1,3-dipoles with those of diatomic species, many authors have commented that the geometries of 1,3-dipoles appear to be consistent with hypercoordinate bonding at the central heavy atom. For example, the very short bond lengths between the heavy atoms in CH2N2 suggest fully-formed C=N and N≡N multiple bonds. Similarly, the experimental bond lengths in N2O are very close to the values expected for an N≡N triple bond and for an N=O double bond. It is tempting to represent this molecule as N≡N=O but, according to Pauling [20], 'this formula suggests that the nitrogen atom can form five covalent bonds, which is not true'. Of course, this statement was linked to the octet rule and it seems worthwhile to challenge its validity. Indeed, there have been several attempts to explain how a nitrogen atom might acquire an apparent valency of five, including suggestions of resonance between 'increased-valence' structures built from undistorted atomic orbitals [21].
The mode of bonding which emerges from spin-coupled descriptions of various 1,3-dipoles and related molecules [22-24] turns out to be closely analogous to the one that we have since established for second-row atoms. Calculations have been performed at many levels, but always result in the same basic picture. For diazomethane, for example, spin-coupled calculations have been performed explicitly for the four 'out-of-plane' π electrons, accommodating the inactive electrons in doubly-occupied orbitals taken from either an RHF or an appropriate CASSCF calculation, or optimizing them simultaneously with the active orbitals. In addition, the active space has been increased in various ways, treating explicitly also the 'in-plane' π system and/or the σ bonding in the heavy atom backbone and/or the nonbonding electrons on the terminal nitrogen and/or the C-H bonds. In all cases, we employed the full spin space and we did not impose any symmetry requirements on the active orbitals. The description of the out-of-plane π system changes remarkably little with the level of calculation.
Figure 3a: Spin-coupled orbitals (π1-π4) for the out-of-plane π system of CH2N2.

The four spin-coupled orbitals for the out-of-plane π-electron system of diazomethane are illustrated in Figure 3a as contours in the σv mirror plane, perpendicular to the molecular plane. Each of these orbitals (π1-π4) takes the form of a deformed 2pπ function, slightly distorted towards one of the neighbouring centres, with one orbital on each of the terminal heavy atoms and two on the central N atom. The two orbitals associated with the central N atom have a high overlap (≈0.8 in a basis of triple-zeta-valence plus polarization quality) but the associated electron spins are not singlet coupled. Instead, the overwhelmingly dominant mode of spin coupling corresponds to fully formed C-N and N-N bonds.

Figure 3b: Symmetry-unique spin-coupled orbitals for the σ bonding framework, the nonbonding electrons and the in-plane π system of CH2N2.

As is shown in Figure 3b, spin-coupled calculations for the remaining electrons confirm fairly ordinary C-H, C-N and N-N σ bonds, radially-split nonbonding orbitals on the terminal N atom, and an unremarkable in-plane N-N π bond. Taking all of these together, the spin-coupled description corresponds to fully formed C=N and N≡N multiple bonds. Using a triple-zeta-valence plus polarization basis set, the difference in energy between the RHF and CASSCF wavefunctions (active space of four π electrons in four orbitals) is ≈120 kJ mol⁻¹. The analogous spin-coupled wavefunction affords an energy within 1 kJ mol⁻¹ of this CASSCF description. A key feature of the spin-coupled description of 1,3-dipoles is its simplicity, provided that we are able to reject long-established prejudices against hypercoordinate first-row atoms. Of course, this does not imply that the classical VB pictures are incorrect: if the spin-coupled descriptions were projected onto a basis of classical VB structures, built from strictly localized orbitals, then significant contributions would indeed arise for a number of very different canonical structures. On the other hand, we do feel that the classical VB description is unnecessarily complicated. The very compact spin-coupled description makes it particularly straightforward to see why diazomethane, for example, should feature such short CN and NN bond lengths, as well as a relatively small dipole moment.
Furthermore, given that the central N atom is utilizing all five of its valence electrons in bonding, it is not surprising that this atom shows a high resilience to attack by electrophiles or nucleophiles, particularly when compared to the reactivity of its neighbours or of N atoms in other π-electron systems. Spin-coupled descriptions analogous to that for CH2N2 have been found to apply to all of the 1,3-dipoles that we have studied, as well as to related systems such as NNO. It is indeed tempting to represent the latter with the structural formula N≡N=O, but it must be clearly understood that the singly occupied orbitals involved in a given bond are not strictly localised on individual atoms. We initially believed [23] that a similar mode of description carries over to O3, but we now realise that this system is somewhat more complicated than we had first supposed. Calculations for the four out-of-plane π electrons, whether using spin-coupled theory or at the CASSCF level, reveal that there exist two solutions that are remarkably close in energy. The one that lies lowest, provided we optimize properly the description of the inactive electrons, corresponds to a singlet diradical whereas the other corresponds to a hypercoordinate central atom [24]. It is clear that neither description carries much conviction on its own, and that we must consider expanding the active space.
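The compactness of the spin-coupled description of diazomethane can be made concrete by counting configurations. The following Python sketch is an illustration, not part of the original calculations. It uses Weyl's dimension formula for the number of singlet configuration state functions (CSFs) in a CASSCF expansion with four electrons in four orbitals, and compares it with the number of spin functions carried by the single spin-coupled orbital configuration. It also converts the energies quoted for diazomethane (a gap of about 120 kJ mol⁻¹ between RHF and CASSCF, with the spin-coupled energy within 1 kJ mol⁻¹ of CASSCF) into a lower bound on the fraction of that gap recovered. The function and variable names are hypothetical.

```python
from math import comb


def singlet_csf_count(n_orbitals, n_electrons):
    """Weyl's formula for the number of singlet CSFs in a full active space."""
    if n_electrons % 2 or n_electrons > 2 * n_orbitals:
        return 0
    half = n_electrons // 2
    return comb(n_orbitals + 1, half) * comb(n_orbitals + 1, half + 1) // (n_orbitals + 1)


def singlet_spin_functions(n_electrons):
    """Number of singlet spin functions for N singly-occupied orbitals."""
    if n_electrons % 2:
        return 0
    half = n_electrons // 2
    return comb(n_electrons, half) - (comb(n_electrons, half - 1) if half >= 1 else 0)


casscf_terms = singlet_csf_count(4, 4)          # CASSCF(4,4) singlet CSFs
spin_coupled_terms = singlet_spin_functions(4)  # one orbital product, all spin couplings

gap_kj_mol = 120.0        # approximate RHF to CASSCF energy difference quoted in the text
shortfall_kj_mol = 1.0    # upper bound on the spin-coupled to CASSCF difference
min_fraction_recovered = 1.0 - shortfall_kj_mol / gap_kj_mol

print("CASSCF(4,4) singlet CSFs:", casscf_terms)
print("spin-coupled spin functions:", spin_coupled_terms)
print("fraction of RHF-CASSCF gap recovered: at least", round(min_fraction_recovered, 4))
```

On these figures a single product of four non-orthogonal orbitals with two spin couplings recovers more than 99% of what the 20-CSF expansion achieves, which is the sense in which the description is called compact.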
3.2. Oxohalides of hypercoordinate nitrogen and phosphorus
We indicated in Section 2 our expectation that trifluoroamine N-oxide, F3NO, could be considered the closest first-row analogue of the hypercoordinate molecule PF5. Indeed, the present authors were pleased to discover from standard inorganic texts not only that F3NO exists, but that it is 'surprisingly resistant' to hydrolysis. Other apparently hypercoordinate nitrogen oxohalides include the nitryl halides FNO2 and ClNO2. Spin-coupled calculations on these last two species [10] indicate that the nitrogen atoms use all five valence electrons in covalent bonding, and that the description of the NO2 group is essentially transferable. We found no evidence for active d orbital participation in the bonding in these molecules or in FPO2.
Some authors have claimed that comparisons of certain geometry and dipole moment data 'provide evidence' for O(pπ)-P(dπ) back-bonding in X3PO species, and so it proved particularly worthwhile to reassess the various experimental data available for X3NO and X3PO species (X = H, CH3, or F), supplementing these with calculations where necessary. It is useful to recall that CH5⁻ is only a transition state in certain reactions whereas SiH5⁻ is a stable intermediate. Taking account only of electronegativity differences, it does not seem unreasonable that R3PO should feature hypercoordinate bonding, but it would be very surprising if R3NO (R = H, CH3) did so. All of this is consistent with the relatively large NO separation in H3NO, compared to that in FNO2, and the relatively short PO separation in H3PO, which is similar to that in FPO2. We found no evidence of supposed O(pπ)-P(dπ) bonding in spin-coupled calculations on FPO2. Hypercoordinate bonding in H3PO tends to reduce the extent of (H3P)δ+Oδ− charge separation relative to that in H3NO, but this is counterbalanced by the reduced electronegativity of P (relative to N). The two effects approximately cancel.
More marked differences appear in the other bonds: the P atom in H3PO accommodates a significant net positive charge, with each H atom approximately neutral, whereas the N atom in H3NO carries a negative charge, with the H atoms positive. It is because of the polarity of the CH3-N and CH3-P bonds that the dipole moment [25] of (CH3)3NO (5.0 D) can exceed that of (CH3)3PO (4.34 D), in spite of the differences in bond lengths. Of course, we can expect both F3PO and F3NO to feature hypercoordinate bonding. Indeed, the NO bond length in F3NO is even shorter than that in FNO2, and the NO stretching frequency is much more similar to those found for double bonds than to those observed in amine oxides. Running through the same arguments as for (CH3)3NO and (CH3)3PO, but remembering that both central atoms are now hypercoordinate, we can predict that the dipole moment of F3NO should be smaller than that of F3PO. The experimental values [26] are 0.04 D and 1.9 D, respectively. There is certainly no evidence in any of this to support notions of O(pπ)-P(dπ) back-bonding.
4. FURTHER EXAMPLES
4.1. Oxofluorides of hypercoordinate sulfur
Spin-coupled theory has been used to investigate the bonding in sulfuryl fluoride, SO2F2, and in the thionyl fluorides, SOF2 and SOF4 [27]. After first localizing RHF MOs, we treated as active all the electrons involved in the σ and π bonds, and any nonbonding electrons on sulfur. Analogous calculations were performed for SO2 and SO3, to enable various comparisons to be made. We also repeated our earlier calculations [9] on SF4, augmenting the active space with the nonbonding electrons on sulfur, but we found our two sets of results to be very similar.
As we have already indicated in Section 3.2, one of the conventional qualitative models of the bonding between oxygen atoms and second-row atoms, as in molecules such as these, involves the back-donation of electron density from filled O(2pπ) orbitals to vacant dπ orbitals on the heavy atom. A fundamental objection to this type of picture is of course the supposed utilization of 3d orbitals as 'valence' orbitals. We found no evidence in any of our work for such O(pπ)-S(dπ) back-bonding or for active involvement of S(3d) electrons in the σ bonds.
The spin-coupled description of the S=O and S-F bonding, and of any nonbonding electrons on sulfur, turns out to be highly transferable. The sulfur atoms in these systems utilize all six valence electrons in two-centre two-electron polar covalent bonds or in angularly split lone-pair-like orbitals. For example, using the symbols ℓi for S nonbonding orbitals, φi for those involved in S-F bonds, and σi and πi for those that describe the S=O bonding, we show in Figure 4 the spin-coupled orbitals for the nonplanar molecule SOF2. Orbitals φ1 and φ2 are shown as contours in the FSF plane, with σ1, σ2, π1 and π2 in the plane that is perpendicular to the mirror plane and contains the SO unit.
Orbital ℓ1 is depicted as a representative three-dimensional contour. The remaining spin-coupled orbitals in this molecule may be obtained from those that are shown by symmetry operations of the molecular point group. It is clear that the S-O π bonds are significantly more polar than the S-O σ bonds, but it is important to keep this in its proper context. A simple Mulliken-like population analysis of such a π orbital indicates S:O parentage in much the same ratio as there is S:F character in the S-F σ bonds in SF4. As such, in the sense that there are polar covalent bonds in SF4, we must conclude that there are polar covalent S-O π bonds in oxides and oxofluorides of hypercoordinate sulfur.
Our calculations on the oxides and oxofluorides were first performed with σ-π separation in the S=O bonds, as in chemistry's traditional language for describing multiple bonding. Otherwise, the orbitals were fully optimized without constraints on their form, on the degree of localization, on the overlaps between them, or on the mode of coupling the electron spins. In a further series of calculations, we also relaxed the constraint of σ-π separation and examined the 'equivalent' or 'bent-bond' descriptions of the SO units in SOF2 and SOF4. There were essentially no changes in the descriptions of the S-F bonds or of the sulfur nonbonding electrons. We find that the 'bent-bond' solutions are energetically preferred over the σ-π separated ones, but only by an extraordinarily small amount of energy. As such, it seems entirely reasonable to use the σ-π separated description, should we find it more convenient.
Figure 4. Spin-coupled orbitals for SOF2, as described in the text.
4.2. Chlorine fluorides and chlorine oxide fluorides
Spin-coupled theory has recently been applied to the bonding in ClFn (n = 2, 4, 6) ions, ClFn (n = 3, 5) neutrals and ClFnOm (n = 1, 3; m = 1, 2) species [28]. The description of the Cl-F bonds changes very little whether or not we treat as active any chlorine nonbonding electrons, and whether we describe ClO units with σ-π separated or bent-bond solutions. Each Cl-X σ bond comprises a Cl(sp^x-like)+X(2p) hybrid which overlaps a distorted X(2p) function. Each Cl-O π bond is composed of a Cl(3pπ)+O(2pπ) hybrid overlapping O(2p), and it is much more polar than the corresponding σ bond. All in all, there are marked similarities to the spin-coupled descriptions of sulfur systems (Section 4.1), and there is every reason to suppose that much the same mode of description will apply to a very wide range of main group molecules.
4.3. Fluorophosphoranes
The molecules PF5-n(CH3)n (n = 1-3) all adopt trigonal bipyramidal structures, with a preference for the methyl substituents to occupy equatorial positions. We have found [29] that the spin-coupled description of the P-F bonds in these systems closely resembles that for the parent molecule, PF5. The two spin-coupled orbitals that describe a typical P-CH3 bond are shown in Figure 5. The closer match of electronegativities results in a phosphorus-based sp^x-like hybrid that exhibits less delocalization onto the methyl group, and in some delocalization of the methyl-based sp^x-like hybrid back onto phosphorus. These bonds are clearly much less polar than the P-F bonds, as we would have anticipated.
Figure 5. Spin-coupled orbitals for a typical P-CH3 bond in fluorophosphoranes.
Our own geometry optimizations confirmed the preference for equatorial methyl groups, but we could discern no significant difference in the spin-coupled description of axial and equatorial P-CH3 bonds. This seems to suggest, at least for these systems, that the geometric preference is more closely linked to steric factors than to details of the bonding.
4.4. YXXY dihalides and dihydrides of dioxygen and disulfur
The spin-coupled descriptions of the X-Y σ bonds in FOOF, HOOH, FSSF, ClSSCl and HSSH [30] closely parallel those for the analogous XY2 species, except, of course, for some reduced (increased) overlaps in those cases with increased (reduced) bond lengths, such as FOOF. In examining the variations in geometry, we found that the most relevant aspect of the bonding is that provided by the various pπ-like orbitals. For FOOF, but not HOOH, we observe delocalization of O(2pπ)-like orbitals onto the adjacent oxygen atom. This is accompanied by the introduction of significant antibonding character into the opposing O-F bonds. Similarly, for FSSF (and to a lesser extent ClSSCl) we observe bending of the fairly large S(3pπ)-like orbitals towards the adjacent sulfur atom, with the introduction of much less XF antibonding character than in the corresponding case of FOOF. Thus, in FOOF and FSSF, and to a lesser extent ClSSCl, there is incipient hypercoordinate character at oxygen or sulfur, with two partial π-like interactions in approximately perpendicular planes.
5. CONCLUSIONS
For first-row atoms, the role of d basis functions in ab initio calculations is to act as polarization functions, augmenting the basic s/p basis set. For transition metal elements, on the other hand, these functions provide a description of the valence d orbital character.
In the case of second-row elements, the so-called 'expansion of the octet' in hypercoordinate compounds has relatively little to do with the availability of d orbitals, and there appears to be no clear-cut demarcation in the utilization of d functions between normal octet and hypercoordinate species. d functions certainly act as polarization functions for second-row atoms, compensating for the inflexibility of s/p basis sets, albeit to a somewhat greater extent than for first-row atoms, but it is not justified to regard them as valence orbitals. Indeed, the importance of d functions tends to diminish with increasing quality of s/p basis sets. The actual role played by d basis functions in descriptions of PF5 has recently been analyzed using a one-centre expansion technique [11]: although the description of the phosphorus valence region is improved by a function of local d character, the corresponding d population is only weakly bound to phosphorus and it should not be considered as chemically bonding. Being firmly based in modern valence bond theory, the very compact descriptions of the bonding which have emerged from our many spin-coupled calculations facilitate a direct and, hopefully, very convincing description of
hypercoordinate bonding without invoking active d orbital participation. We assert the democracy principle: almost all valence electrons can participate in chemical bonding if provided with sufficient energetic incentives. We find this simple idea, as well as simple considerations of atomic size and electronegativity, to be of much greater utility than the octet rule when describing the bonding in halides, oxides and oxohalides of second-row atoms, and in rationalizing the existence of noble gas compounds. We may use the same general language when trying to understand why SiH5⁻ is a stable intermediate in certain reactions whereas CH5⁻ is only a transition state. Analogous arguments can be used to describe the bonding in 1,3-dipolar molecules containing first-row atoms, such as diazomethane, and in oxohalides of hypercoordinate nitrogen. The bonds involving hypercoordinate atoms tend to be highly polar. There are no significant qualitative differences between the hypercoordinate nature of first-row, second-row and noble gas atoms in appropriate chemical environments, nor between the descriptions of the bonding in hypercoordinate and so-called 'normal octet' molecules, except for some differences in bond polarity.

REFERENCES
1. P.J. Hay, J. Am. Chem. Soc. 99 (1977) 1003.
2. (a) H. Wallmeier and W. Kutzelnigg, J. Am. Chem. Soc. 101 (1979) 2804. (b) W. Kutzelnigg, Angew. Chem., Int. Ed. Engl. 23 (1984) 272.
3. E. Magnusson and H.F. Schaefer, J. Chem. Phys. 83 (1985) 5721.
4. A.E. Reed and F. Weinhold, J. Am. Chem. Soc. 108 (1986) 3586.
5. (a) C.H. Patterson and R.P. Messmer, J. Am. Chem. Soc. 111 (1989) 8059. (b) ibid. 112 (1990) 4138.
6. A.E. Reed and P.v.R. Schleyer, J. Am. Chem. Soc. 112 (1990) 1434.
7. (a) E. Magnusson, J. Am. Chem. Soc. 112 (1990) 7940. (b) ibid. 115 (1993) 1051.
8. R.P. Messmer, J. Am. Chem. Soc. 113 (1991) 433.
9. D.L. Cooper, T.P. Cunningham, J. Gerratt, P.B. Karadakov, and M. Raimondi, J. Am. Chem. Soc. 116 (1994) 4414.
10. T.P. Cunningham, D.L. Cooper, J. Gerratt, P.B. Karadakov, and M. Raimondi, Int. J. Quant. Chem. 60 (1996) 393.
11. M. Häser, J. Am. Chem. Soc. 118 (1996) 7311.
12. J. Pipek and P.G. Mezey, J. Chem. Phys. 90 (1989) 4916.
13. D.J. Hajdasz and R.R. Squires, J. Am. Chem. Soc. 108 (1986) 3139.
14. G. Sini, G. Ohanessian, P.C. Hiberty, and S.S. Shaik, J. Am. Chem. Soc. 112 (1990) 1409, and references therein.
15. A. Padwa, ed., 1,3-Dipolar Cycloaddition Chemistry, Vols. 1 & 2, Wiley, New York, 1984.
16. S.P. Walch and W.A. Goddard III, J. Am. Chem. Soc. 97 (1975) 5319.
17. P.C. Hiberty and C. Leforestier, J. Am. Chem. Soc. 100 (1978) 2012.
18. P.C. Hiberty and G. Ohanessian, J. Am. Chem. Soc. 104 (1982) 66.
19. R.D. Harcourt and W. Roso, Can. J. Chem. 56 (1978) 1093.
20. L. Pauling, The Nature of the Chemical Bond, 3rd Edition, p. 187.
21. R.D. Harcourt, Lecture Notes in Chemistry, Vol. 30, Springer-Verlag, 1982.
22. D.L. Cooper, J. Gerratt, M. Raimondi, and S.C. Wright, Chem. Phys. Lett. 138 (1987) 296.
23. D.L. Cooper, J. Gerratt, and M. Raimondi, J. Chem. Soc., Perkin Trans. 2 (1989) 1187.
24. T. Thorsteinsson, D.L. Cooper, J. Gerratt, P.B. Karadakov, and M. Raimondi, Theor. Chim. Acta 93 (1996) 343.
25. (a) N. Hacket and R.J.W. Le Fèvre, J. Chem. Soc. (1961) 1612. (b) R.S. Armstrong, M.J. Aroney, R.J.W. Le Fèvre, R.K. Pierens, J.D. Saxby, and C.J. Wilkins, J. Chem. Soc. A (1969) 2735.
26. (a) W.H. Kirchoff and D.R. Lide Jr, J. Chem. Phys. 51 (1969) 467. (b) R.H. Kagann, I. Ozier, and M.C.L. Gerry, J. Mol. Spectrosc. 71 (1978) 281.
27. T.P. Cunningham, D.L. Cooper, J. Gerratt, P.B. Karadakov, and M. Raimondi, to be published.
28. D.L. Cooper, T. Bregeron, J. Gerratt, P.B. Karadakov, and M. Raimondi, to be published.
29. D.L. Cooper, P. Butz, J. Gerratt, P.B. Karadakov, and M. Raimondi, to be published.
30. D.R. Alleres, D.L. Cooper, T.P. Cunningham, J. Gerratt, P.B. Karadakov, and M. Raimondi, J. Chem. Soc., Faraday Trans. 91 (1995) 3357.
Z.B. Maksić and W.J. Orville-Thomas (Editors)
Pauling's Legacy: Modern Modelling of the Chemical Bond
Theoretical and Computational Chemistry, Vol. 6
© 1999 Elsevier Science B.V. All rights reserved.
The electronic structure of transition metal compounds
G. Frenking, C. Boehme and U. Pidun
Fachbereich Chemie, Philipps-Universität Marburg, Hans-Meerwein-Straße, D-35032 Marburg
1. INTRODUCTION
Linus Pauling was a pioneer of chemical bonding theory and he is recognized as one of the architects of modern chemistry. The main characteristic of his scientific approach was the reconciliation of empirical knowledge with quantum theory: he was interested not only in the calculation of the structures and properties of molecules, but also in their interpretation in a fundamental but simple and transparent way. He was the first to realize that molecular and electronic structure are the central subjects of chemistry, and he was the first to systematically investigate the nature of the chemical bond [1]. Linus Pauling introduced a number of valuable theoretical chemical models, like the concepts of hybridization and electronegativity, which are now well-established tools for the discussion of chemical phenomena. More importantly, he initiated a new way of looking at molecules and chemical reactions, building a bridge between chemistry and physics.
Pauling always favored the Valence Bond (VB) theory over the Molecular Orbital (MO) theory for the description of the electronic structure of molecules, because the VB model more closely resembles the pre-quantum theoretical models of chemical bonding. However, modern quantum chemistry is dominated by MO theory, which has clearly prevailed in computational applications. Nevertheless, a number of terms and concepts of VB theory still play an important role when it comes to the interpretation of the results of a quantum chemical calculation.
During the last two decades, computational chemistry has undergone a breathtaking development, introducing new methods for calculating molecules of light and heavy atoms. It is now possible to predict theoretically the structures of molecules and the energetics of chemical reactions with an accuracy that challenges the experimentalists.
Less emphasis has been placed on the progress of the bond theoretical interpretation of the quantum chemical results, although Pauling's successors developed several new methods to transform the ever more sophisticated quantum mechanical wave functions into chemical models. Prominent examples are the Natural Bond Orbital (NBO) analysis of Weinhold and coworkers [2], Bader's topological analysis of the electron density distribution [3] and various schemes for the interpretation of donor-acceptor interactions between two fragments [4-7].
The understanding of the bonding in transition metal compounds presents a special challenge for theoretical chemistry. Just as in the case of the main group elements [8], it could be shown that a classification of the bonds between a transition metal and a ligand into
covalent and donor-acceptor bonds is not only possible, but also very useful [9]. The charge decomposition analysis (CDA) [7] can be used as a theoretical indicator to distinguish between the two types of bonding. It provides a quantitative expression of the familiar Dewar-Chatt-Duncanson model [10] for the description of donor-acceptor interactions, while covalent transition metal-ligand bonds are better discussed using the hybridization picture. However, principles and rules derived for main group chemistry, such as Bent's rule [11], have to be modified when transition metals are involved [12]. The problem of the interdependence between sd^x hybridization and molecular structure [13] is still not completely understood.
In this chapter, we want to address a particular problem of transition metal chemistry: the bonding between a transition metal and a main group metal. Very recently, transition metal carbonyl complexes of Al(I) and Ga(I) fragments, e.g. [(CO)5Cr-Ga(C2H5)L2] (L2 = tmeda) [14], [(CO)4Fe-AlCp*] [15] and [(CO)5W-AlClL2] (L2 = tmeda) [16], could be synthesized and characterized by X-ray analysis. The position of the ν(CO) IR bands was taken as an indication against the extreme description of the complexes as contact ion pairs, e.g. [Cp*Al]²⁺[Fe(CO)4]²⁻. The overall structures suggest an interpretation of the interaction between the transition metal and the main group metal as a donor-acceptor bond. In order to clarify the bonding situation we performed quantum chemical ab initio calculations of the model complexes [(CO)5W-AlCl(NH3)2] and [(CO)5W-AlCl] and of the two series [(CO)5W-XCl(NH3)2] (X = B, Al, Ga, In, Tl) and [(CO)5W-Y] (Y = [Na(NH3)3]⁻, [Mg(NH3)3], [AlCl(NH3)2], [SiCl2(NH3)]).
We optimized the geometries of the complexes, calculated bond dissociation energies and analyzed the electronic structure using the NBO method [2] and the charge decomposition analysis (CDA) [7] in order to find out how classical bond theoretical concepts can be applied to this new class of compounds.
2. COMPUTATIONAL DETAILS
The geometry optimizations have been carried out at the MP2 level of theory [17] using effective core potentials (ECPs) [18] for the heavier elements. Hydrogen and the first and second row elements B, C, N, O, Na, Mg, Al and Si were described by standard all-electron 6-31G(d) basis sets [19]. For tungsten we used the relativistic ECP developed by Hay and Wadt and the corresponding (441/2111/21) split-valence basis set [20]. A pseudopotential with a (31/31/1) valence basis set was used for Cl, Ga, In and Tl [21]. This basis set combination is our standard basis set II [22].
To check whether the calculated structures are true minima on the potential energy surface, we first optimized the geometries of [(CO)5W-AlCl] and of the series [(CO)5W-Y] (Y = [Na(NH3)3]⁻, [Mg(NH3)3], [AlCl(NH3)2], [SiCl2(NH3)]) at the HF/II level of theory, followed by numerical calculation of the corresponding Hessian matrices. All structures were found to be minima, as indicated by exclusively positive eigenvalues of the Hessian. Starting from the HF structures we further optimized the geometries at MP2/II. Numerical frequency calculations are not feasible at this level. However, the structural differences between the HF/II and the MP2/II geometries are only minor, and we can thus expect the MP2/II structures to be minima as well. For the series [(CO)5W-XCl(NH3)2] (X = B, Ga, In, Tl) we only performed MP2/II optimizations without frequency calculations. The close similarity of the obtained structures
and the corresponding structure of [(CO)5W-AlCl(NH3)2] makes it highly probable that the remaining complexes of the group 13 metals are also true minima on the potential energy surface.
The bonding situation in the complexes was analyzed using the natural bond orbital (NBO) method developed by Weinhold and coworkers [2]. The metal-ligand donor-acceptor interactions were investigated with the newly developed charge decomposition analysis (CDA) [7]. In the CDA method the (canonical or natural) molecular orbitals of the complex are expressed in terms of the MOs of appropriately chosen fragments. For the analysis of the W-L interactions, the natural orbitals (NOs) of the MP2/II wavefunction of (CO)5W-L are constructed by a linear combination of the MOs of the fragments (CO)5W and L in their geometries in (CO)5W-L. Charge donation d from L to (CO)5W is then given by the interaction between the occupied orbitals at L and the empty orbitals at (CO)5W. Similarly, the backdonation b from (CO)5W to L is given by the interaction between filled orbitals at (CO)5W and empty orbitals at the ligand L. Finally, the repulsive polarization r results from the interaction between the occupied orbitals of both fragments. The residual term Δ, which is due to the interaction between empty orbitals of the two fragments, should be virtually zero for a true donor-acceptor complex. It could be shown that a non-vanishing residual term is indicative of compounds for which the description of the bonding in the framework of the Dewar-Chatt-Duncanson model is not appropriate [9].
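$$d_i = \sum_{k}^{\mathrm{occ},A} \sum_{n}^{\mathrm{vac},B} m_i \, c_{ki} \, c_{ni} \, \langle \varphi_k | \varphi_n \rangle$$

$$b_i = \sum_{l}^{\mathrm{occ},B} \sum_{m}^{\mathrm{vac},A} m_i \, c_{li} \, c_{mi} \, \langle \varphi_l | \varphi_m \rangle$$

$$r_i = \sum_{k}^{\mathrm{occ},A} \sum_{l}^{\mathrm{occ},B} m_i \, c_{ki} \, c_{li} \, \langle \varphi_k | \varphi_l \rangle$$

A = L; B = (CO)5W; m_i = occupation number; c = fragment orbital coefficient; φ = fragment MO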
The sum of the orbital contributions d_i, b_i and r_i yields the total amount of donation (d), backdonation (b) and repulsion (r), respectively. The CDA calculations have been performed using the program CDA 2.1 [23]. All other calculations have been carried out with the program package Gaussian 94 [24].
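The four CDA terms described above can be illustrated with a short sketch. This is not the CDA 2.1 program: the function name, the index layout and the toy numbers are assumptions made purely for illustration. Each term is a sum of occupation number times two fragment-orbital coefficients times a fragment-orbital overlap, taken over the appropriate occupied/vacant pairs.

```python
# Minimal sketch of the CDA terms for ONE orbital i of the complex.
# Assumptions (hypothetical, not from the source): fragment orbitals are
# numbered 0..N-1, coeffs[p] is the coefficient of fragment orbital p in
# complex orbital i, and overlap[p][q] is the overlap of fragment orbitals p, q.

def cda_terms(occupation, coeffs, overlap, occ_A, vac_A, occ_B, vac_B):
    """Return (d_i, b_i, r_i, residual_i) for a single complex orbital."""

    def pair_sum(first, second):
        # Sum of m_i * c_p * c_q * <phi_p|phi_q> over all pairs (p, q).
        return sum(
            occupation * coeffs[p] * coeffs[q] * overlap[p][q]
            for p in first
            for q in second
        )

    donation = pair_sum(occ_A, vac_B)      # occupied on L, empty on (CO)5W
    backdonation = pair_sum(occ_B, vac_A)  # occupied on (CO)5W, empty on L
    repulsion = pair_sum(occ_A, occ_B)     # occupied on both fragments
    residual = pair_sum(vac_A, vac_B)      # empty on both fragments
    return donation, backdonation, repulsion, residual


# Toy data: orbital 0 = occupied on A, 1 = vacant on A,
#           orbital 2 = occupied on B, 3 = vacant on B.
TOY_OVERLAP = [
    [1.0, 0.0, -0.1, 0.3],
    [0.0, 1.0, 0.2, 0.0],
    [-0.1, 0.2, 1.0, 0.0],
    [0.3, 0.0, 0.0, 1.0],
]
TOY_COEFFS = [0.8, 0.1, 0.2, 0.5]

d_i, b_i, r_i, delta_i = cda_terms(
    2.0, TOY_COEFFS, TOY_OVERLAP, occ_A=[0], vac_A=[1], occ_B=[2], vac_B=[3]
)
print(d_i, b_i, r_i, delta_i)
```

Summing such per-orbital terms over all orbitals of the complex would give the totals d, b, r and the residual term reported in the CDA tables.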
3. RESULTS AND DISCUSSION
We start our discussion of bonding between transition metals and main group metals with a thorough investigation of the model systems [(CO)5W-AlCl(NH3)2] and [(CO)5W-AlCl]. The first system closely resembles the experimentally found structure [(CO)5W-AlClL2] (L2 = tmeda) [16]. The second system has the bases removed from the main group metal Al and therefore helps to clarify the influence of these bases on the W-Al bond. Besides aluminum, gallium and indium have also been used experimentally as coordinating ligands. The next step therefore is to compare the different group 13 metals in the series [(CO)5W-XCl(NH3)2] (X = B, Al, Ga, In, Tl). Finally, we take into account similar structures with metal ligands from other main groups by analyzing the series [(CO)5W-Y] (Y = [Na(NH3)3]⁻, [Mg(NH3)3], [AlCl(NH3)2], [SiCl2(NH3)]). The chosen molecules make it possible to compare the complexes [(CO)5W-XCln(NH3)m]x⁻ for elements along a group (B-Tl) or a row (Na-Si) of the periodic system.
3.1. Chemical Bonding in [(CO)5W-AlCl(NH3)2] and [(CO)5W-AlCl]
The MP2 optimized geometries of [(CO)5W-AlCl(NH3)2] and [(CO)5W-AlCl] and the free ligands [AlCl(NH3)2] and AlCl are shown in figure 1. The W-Al bond length in [(CO)5W-AlCl(NH3)2] amounts to 2.575 Å, which is in reasonable agreement with the experimental value of 2.646 Å in [(CO)5W-AlClL2] (L2 = tmeda) [16]. The Al-Cl bond length is 2.167 Å compared to the experimental result of 2.200 Å. In the base-free complex [(CO)5W-AlCl] both the W-Al bond (2.481 Å) and the Al-Cl bond (2.088 Å) are clearly shorter. At the same time, the ammonia bases strongly influence the geometry of the W(CO)5 unit. In [(CO)5W-AlCl] the Al-W-(CO)cis angle is about 90°, but in [(CO)5W-AlCl(NH3)2] the cis CO groups are bent towards the aluminum. The Al-W-(CO)cis angle decreases to 81.1° for the CO groups eclipsed with the NH3 groups and to 85.7° for the CO groups staggered to the Cl atom. This umbrella effect is also observed in the experimental structures. Figure 1 further shows that the more strongly bent CO groups also have W-(CO) bond lengths about 0.05 Å shorter than those of the less bent CO groups. The geometries of the free ligands differ considerably from the geometries of the bonded ligands. The Al-Cl bond gets shorter by 0.052 Å in the case of [AlCl(NH3)2] and by 0.046 Å in the case of AlCl upon binding to tungsten (figure 1). The Al-N bonds in [AlCl(NH3)2] also get shorter upon binding to W(CO)5, decreasing from 2.266 Å to 2.076 Å.
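As a small worked check of the "reasonable agreement" quoted above, the following sketch computes the deviations between the calculated (MP2/II) and experimental bond lengths. The dictionary layout and function name are illustrative assumptions; note also that the calculated values refer to the NH3 model complex whereas the experimental ones refer to the tmeda complex, so the comparison is only approximate.

```python
# Calculated (MP2/II, NH3 model) and experimental (tmeda complex) bond
# lengths in angstrom, as quoted in the text.
BOND_LENGTHS = {
    "W-Al": (2.575, 2.646),
    "Al-Cl": (2.167, 2.200),
}


def deviation(calc, exp):
    """Return (absolute deviation in angstrom, relative deviation in percent)."""
    diff = calc - exp
    return diff, 100.0 * diff / exp


for bond, (calc, exp) in BOND_LENGTHS.items():
    diff, percent = deviation(calc, exp)
    print(f"{bond}: {diff:+.3f} A ({percent:+.1f} %)")
```

Both calculated bonds come out slightly shorter than experiment, by less than 3 % in each case.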
Figure 1. MP2 optimized geometries of [(CO)5W-AlCl(NH3)2], [(CO)5W-AlCl], [AlCl(NH3)2], and AlCl.
Table 1. Calculated Dissociation Energies De (kcal/mol) at MP2/II

Molecule               Symm.   Bond    De
(CO)5WAlCl(NH3)2       Cs      W-Al     93.1
                               Al-N     65.2 a
(CO)5WAlCl             C4v     W-Al     58.4
(CO)5WBCl(NH3)2        Cs      W-B     119.6
(CO)5WGaCl(NH3)2       Cs      W-Ga     70.9
(CO)5WInCl(NH3)2       Cs      W-In     70.5
(CO)5WTlCl(NH3)2       Cs      W-Tl     47.8
(CO)5WSiCl2(NH3)       Cs      W-Si     74.8
(CO)5WMg(NH3)3         Cs      W-Mg    114.0
[(CO)5WNa(NH3)3]⁻      Cs      W-Na    126.4

a Dissociation energy of two NH3 ligands.
The dissociation energy of the W-Al bond is 93.1 kcal/mol in [(CO)5W-AlCl(NH3)2] and 58.4 kcal/mol in [(CO)5W-AlCl] (table 1). This means that adding the ammonia bases leads to a very significant bond strengthening of about 35 kcal/mol. The energy for the dissociation of the two NH3 groups from [(CO)5W-AlCl(NH3)2] is calculated to be 65.2 kcal/mol.
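The arithmetic behind the quoted strengthening can be sketched as follows. The thermodynamic-cycle estimate for the binding of two NH3 molecules to free AlCl is not given in the source; it is an illustration that assumes all three energies are De values referring to relaxed fragments at the same level of theory. The variable and function names are hypothetical.

```python
# Dissociation energies De in kcal/mol at MP2/II, taken from Table 1.
DE_W_AL_WITH_BASES = 93.1   # [(CO)5W-AlCl(NH3)2] -> (CO)5W + AlCl(NH3)2
DE_W_AL_BASE_FREE = 58.4    # [(CO)5W-AlCl]       -> (CO)5W + AlCl
DE_TWO_NH3_COMPLEX = 65.2   # [(CO)5W-AlCl(NH3)2] -> [(CO)5W-AlCl] + 2 NH3


def bond_strengthening():
    """W-Al bond strengthening caused by the two NH3 bases."""
    return DE_W_AL_WITH_BASES - DE_W_AL_BASE_FREE


def nh3_binding_to_free_alcl():
    """Cycle estimate (assumption): AlCl(NH3)2 -> AlCl + 2 NH3."""
    return DE_TWO_NH3_COMPLEX + DE_W_AL_BASE_FREE - DE_W_AL_WITH_BASES


print(f"W-Al strengthening:        {bond_strengthening():.1f} kcal/mol")
print(f"2 NH3 binding to free AlCl: {nh3_binding_to_free_alcl():.1f} kcal/mol")
```

Under these assumptions the two NH3 ligands are bound more strongly in the complex than in the free ligand by the same amount as the W-Al bond strengthening, as the cycle requires.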
Table 2. Results of the NBO Analysis at MP2/II. Partial Charges q, Wiberg Bond Indices P, Hybridization of the Lone-pair Donor Orbital at Atom X (a) given by the s-Contribution %s

Molecule               q(W)    q(W(CO)5)   q(X) a   P(W-X) a   %s
(CO)5WAlCl(NH3)2       -0.72   -0.90       +1.24    0.40       23.9
(CO)5WAlCl             -0.94   -0.58       +1.13    0.59       64.1
AlCl(NH3)2             -       -           +0.54    -          84.0
AlCl                   -       -           +0.68    -          93.8
(CO)5WBCl(NH3)2        -0.53   -0.63       +0.19    0.42       35.2
(CO)5WGaCl(NH3)2       -0.73   -0.72       +1.07    0.44       18.3
(CO)5WInCl(NH3)2       -0.71   -0.77       +1.18    0.42       19.8
(CO)5WTlCl(NH3)2       -0.70   -0.61       +1.08    0.42       36.2
(CO)5WSiCl2(NH3)       -1.13   -0.55       +1.27    0.57       30.2
(CO)5WMg(NH3)3         -0.90   -1.65       +1.59    0.13       98.3
[(CO)5WNa(NH3)3]⁻      -0.63   -1.86       +0.85    0.03       99.8

a X = Al, B, Ga, In, Tl, Si, Mg, Na, respectively.
In order to use Pauling's model of orbital hybridization for understanding the W-Al bond in [(CO)5W-AlCl], a good option is to use a buildup procedure starting from AlCl. In AlCl the aluminum atom has the formal oxidation state I. We therefore have a free electron pair which occupies an orbital with 93.8% s-character according to the NBO analysis (table 2). Upon bonding to [(CO)5W] a donor bond is formed. The s-character of the lone pair orbital is reduced to 64.1%. This is understandable, since bonding to the tungsten atom requires a directed orbital, which is obtained by increasing its p-character. The charge decomposition analysis (CDA) shows that the donation/backdonation ratio in the resulting complex is about 1.2 (table 3), which means that the complex is slightly dominated by ligand→metal donation. The shift of charge from the aluminum to the tungsten atom is shown by the rise of the NPA charge on Al from +0.68 in the free ligand to +1.13 in the complex (table 2). The resulting loss of electron density on Al explains the observed shortening of the Al-Cl bond upon coordination.
Now we investigate the role of the ammonia bases. If two NH3 bases are added to the complex [(CO)5W-AlCl], yielding [(CO)5W-AlCl(NH3)2], the W-Al bond energy increases (table 1), while at the same time the W-Al bond gets longer (figure 1). How can this seemingly paradoxical result be explained? Again, the key to the answer is to study the hybridization at the Al atom. Upon addition of the bases, the s-character of the free electron pair of the coordinated aluminum drops further, from 64.1% to 23.9%. This improves the directionality of the orbital and therefore the overlap with the accepting orbitals on tungsten, resulting in the observed higher bond energy. On the other hand, the p-orbitals are larger than the s-orbital, which means that even though the W-Al bond is stronger than in the base-free complex, it still must be longer.
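To connect the %s values quoted above with the familiar sp^x notation, the sketch below converts an s-contribution into a hybridization exponent x = %p/%s. It assumes, for illustration only, that the remainder of the hybrid is pure p-character (any d contribution is neglected); the function name is hypothetical.

```python
def sp_exponent(percent_s):
    """Return x in sp^x for a hybrid with the given percent s-character.

    Assumption: everything that is not s-character is p-character.
    """
    if not 0.0 < percent_s <= 100.0:
        raise ValueError("percent_s must lie in (0, 100]")
    return (100.0 - percent_s) / percent_s


# s-character of the Al lone-pair donor orbital, from Table 2.
AL_LONE_PAIR_S_CHARACTER = {
    "AlCl (free ligand)": 93.8,
    "(CO)5W-AlCl": 64.1,
    "(CO)5W-AlCl(NH3)2": 23.9,
}

for species, percent_s in AL_LONE_PAIR_S_CHARACTER.items():
    print(f"{species}: sp^{sp_exponent(percent_s):.2f}")
```

On this simple reading the donor orbital changes from an almost pure s orbital in free AlCl to roughly sp³ in the base-stabilized complex, in line with the increasing directionality discussed in the text.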
The increased donor ability of the aluminum ligand can also be seen in the NPA charge of the coordinated Al (+1.24 compared to +1.13 without bases) and the CDA donation/backdonation ratio, which grows from 1.2 to 1.3 (table 3). As for the base-free AlCl ligand, the charge transfer from the ligand to the tungsten atom upon coordination results in the shortening of the Al-Cl and Al-N bonds mentioned above (figure 1).

Table 3. CDA Results at MP2/II for the Metal-Ligand Donor-Acceptor Bonds. Ligand-to-Metal Donation M←L, Backdonation M→L, Repulsive Polarization M↔L, Residual Term Δ and Donation/Backdonation ratio d/b

Molecule               Bond    Donation   Backdonation   Repulsion   Residual   d/b
                               M←L        M→L            M↔L         Δ
(CO)5WAlCl(NH3)2       W-Al     0.356      0.280          -0.290       0.089     1.3
(CO)5WAlCl             W-Al     0.370      0.301          -0.260       0.027     1.2
(CO)5WBCl(NH3)2        W-B      0.196      0.095          -0.428      -0.051     2.1
(CO)5WGaCl(NH3)2       W-Ga     0.434      0.213          -0.272       0.035     2.0
(CO)5WInCl(NH3)2       W-In     0.449      0.207          -0.265       0.043     2.2
(CO)5WTlCl(NH3)2       W-Tl     0.411      0.114          -0.189      -0.006     3.6
(CO)5WSiCl2(NH3)       W-Si     0.125      0.118          -0.370      -0.004     1.1
(CO)5WMg(NH3)3         W-Mg    -0.026      0.370          -0.189       0.173     -
[(CO)5WNa(NH3)3]⁻      W-Na    -0.136      0.338          -0.056       0.051     -
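The d/b column of Table 3 follows directly from the donation and backdonation columns. The sketch below recomputes it; the data layout and function name are illustrative assumptions, and the choice to return None when the donation is not positive simply mirrors the dashes shown for the Mg and Na complexes.

```python
# (donation M<-L, backdonation M->L) from Table 3, keyed by W-X bond.
CDA_RESULTS = {
    "W-Al (with NH3)": (0.356, 0.280),
    "W-Al (base free)": (0.370, 0.301),
    "W-B": (0.196, 0.095),
    "W-Ga": (0.434, 0.213),
    "W-In": (0.449, 0.207),
    "W-Tl": (0.411, 0.114),
    "W-Si": (0.125, 0.118),
    "W-Mg": (-0.026, 0.370),
    "W-Na": (-0.136, 0.338),
}


def donation_backdonation_ratio(donation, backdonation):
    """Return d/b rounded to one decimal, or None if d is not positive."""
    if donation <= 0.0 or backdonation <= 0.0:
        return None  # ratio is not meaningful, shown as '-' in Table 3
    return round(donation / backdonation, 1)


for bond, (d, b) in CDA_RESULTS.items():
    print(bond, donation_backdonation_ratio(d, b))
```

The recomputed ratios reproduce the tabulated values, which is a useful consistency check on the reconstructed columns.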
While our theoretical results offer a sound interpretation of the Al-W donor-acceptor bond, the umbrella effect is more difficult to explain. As it is not observed in the base-free case, agostic interactions between the CO and NH3 groups seem to be a plausible explanation. However, figure 2 shows the geometry of [(CO)5W-H]⁻. In this compound agostic interactions are not possible, yet a clear umbrella effect is still found. Why, then, is it not observed with AlCl as ligand? In AlCl the empty p-orbitals of Al are well suited for accepting backdonation of electron density from tungsten. In order to achieve optimal overlap, the donor and the acceptor orbitals must have a parallel orientation, which is fulfilled for the base-free complex. H⁻ and [AlCl(NH3)2], on the other hand, do not have free orbitals to accept backdonation: in the case of H⁻ there are no orbitals available, and for [AlCl(NH3)2] the p-orbitals on Al are at least partly filled by the additional NH3 groups. Therefore, backbonding is the factor that hinders the umbrella effect in [(CO)5W-AlCl]. The question of what causes this effect in the other compounds still remains open.
Figure 2. MP2 optimized geometry of [W(CO)5H]-.
3.2. The Series [(CO)5W-XCl(NH3)2] (X = B, Al, Ga, In, Tl)

Now we compare the aluminum model system [(CO)5W-AlCl(NH3)2] with its homologues [(CO)5W-XCl(NH3)2] (X = B, Ga, In, Tl). The optimized geometries are shown in figure 3. As expected, the increasing atomic radii of the elements X result in an increasing W-X bond length. It rises from 2.349 Å for the W-B bond to 2.800 Å for the W-Tl bond. The value of 2.586 Å for the W-Ga bond is shorter than the experimental value of 2.70 Å in [(CO)5WGaEt(tmeda)] [16]. However, it is known from similar chromium complexes that the exchange of Et for Cl leads to longer M-Ga bonds [16]. The W-X bond length exhibits only a small change when going from Al to Ga and from In to Tl. The reason for this is the first filling of a d-shell before Ga and the first filling of an f-shell before Tl (lanthanoid contraction). In the [(CO)5W] fragment the most apparent change in the series is the decreasing length of the W-(CO)trans bond, which falls from 2.051 Å in the boron case to 2.004 Å in the thallium case. This decrease in the bond length to the trans CO group upon going down group 13 correlates nicely with the decrease in bond strength to the corresponding [XCl(NH3)2] ligand. The MP2 bond energies of the W-X bonds (table 1) decrease from 119.6 kcal/mol for X = B to 47.8 kcal/mol for X = Tl. There seems to be some competition between X and the trans CO group for accepting electron density from filled orbitals of tungsten. The differences in the bond energies are rather large, with the notable exception of the Ga and In compounds, which have approximately the same W-X bond energy (70.9 kcal/mol and 70.5 kcal/mol, respectively). It should be noted that, even though boron forms the strongest bond to tungsten, the boron compound is probably not the chemically most stable compound in the series, because boron has the highest tendency to be oxidized to the stable oxidation state III.
Figure 3. MP2 optimized geometries of the series [(CO)5W-XCl(NH3)2] (X = B, Ga, In, Tl).
Figure 3 (contd.).

The hybridization of the donor electron pair of the group 13 elements does not show a smooth trend from boron to thallium (table 2). The boron lone pair in the complex has a comparatively large s-character (35.2%). Upon going down the group, the s-character of the donor lone pair decreases. This effect can be traced back to the increasing difference in the spatial extension of the s- and p-orbitals, which makes sp-hybridization less favorable [25]. Thallium, however, again has a lone pair with high s-character. One should note that for thallium the oxidation state I is more stable than the oxidation state III. The reason for this is the comparatively large energy gap between its s- and p-orbitals. The "inert s-electron pair" is a result of relativistic effects which cause a significant contraction of the s-shells of the 6th period elements [26]. Therefore, the promotion of electron density from the s- to the p-valence space due to the influence of the ammonia bases is not as favorable as for the other metals.
The resulting less effective overlap with the corresponding tungsten orbitals leads to the weakest W-X bond in the series. The donation/backdonation ratio as calculated with the CDA (table 3) does not show a trend from boron to thallium. Boron, gallium and indium have similar ratios of about 2, which identifies them as strong donors with moderate acceptor capacities. For aluminum this ratio drops to 1.3, which means that in the case of [AlCl(NH3)2] the donating and backdonating capabilities are roughly the same. Thallium, on the other hand, shows a ratio of 3.6 and therefore is only a weak acceptor ligand. The reason for this could be the long W-Tl bond, which probably prohibits good overlap between suitable orbitals of the [(CO)5W] and [TlCl(NH3)2] fragments.
3.3. The Series [(CO)5W-Y] (Y = [SiCl2(NH3)], [AlCl(NH3)2], [Mg(NH3)3], [Na(NH3)3]-)

In the final part of our discussion we compare the properties of ligands in which elements from different main groups serve as donor atoms. The optimized geometries of the complexes [(CO)5W-Y] (Y = [SiCl2(NH3)], [Mg(NH3)3], [Na(NH3)3]-) are shown in figure 4. The W-X (X = Si, Al, Mg, Na) bond lengths range from 2.463 Å for X = Si to 3.147 Å for X = Na. The largest increase is observed upon going from X = Mg (2.699 Å) to X = Na. The sodium compound is the only anionic complex discussed here. As the sodium anion is very bulky, the long W-Na bond should be expected. The umbrella effect gets stronger from X = Si to X = Na. If one takes the smallest C-W-X angle as a measure, the bending of the cis standing CO groups towards the ligand changes only slightly from X = Si (C-W-Si = 83.7°) to X = Al (C-W-Al = 81.1°). In the magnesium and sodium compounds bending gets very strong; the smallest C-W-X angles decrease to only 72.1° and 62.8°, respectively. Note, however, that in both cases the largest C-W-X angle still exceeds 80°. All structures have in common that for the cis standing CO groups a small C-W-X bond angle corresponds to a short C-W bond. The bond length to the trans standing CO group decreases from 2.052 Å for X = Si to 2.002 Å for X = Na. The decreasing length of the W-(CO)trans bond is correlated with a decreasing energy of the W-X bond in the series [(CO)5W-XCl(NH3)2] (X = B, Al, Ga, In, Tl). In the [(CO)5W-Y] series, however, it is correlated with an increasing W-X bond energy (table 1). The dissociation energy of the W-X bond rises steadily from 74.8 kcal/mol for [(CO)5W-SiCl2(NH3)] to 126.4 kcal/mol for [(CO)5W-Na(NH3)3]-, even though the corresponding W-X bond length increases. In order to understand this surprising behavior we investigated the covalent W-X bond order as given by the Wiberg bond index (table 2).
It does not change very much for the group 13 ligands, but for the [(CO)5W-Y] series it decreases from 0.57 for Y = [SiCl2(NH3)] to 0.03 for Y = [Na(NH3)3]-. Obviously, the W-(CO)trans bond length in the [(CO)5W-Y] series is correlated with the covalency of the W-X bond. This is in agreement with the conclusion made before that there is a competition between the trans standing CO group and the metal ligand for electron density from tungsten: higher covalency of the W-X bond results in stronger concentration of this density in the direction of the ligand X and therefore leads to longer W-(CO)trans bonds.
Figure 4. MP2 optimized geometries of the series [(CO)5W-Y] (Y = [SiCl2(NH3)], [Mg(NH3)3], [Na(NH3)3]-).
It is striking that upon going from Y = [AlCl(NH3)2] to Y = [Mg(NH3)3] the Wiberg bond index for the W-X bond drops sharply from 0.40 to 0.13. At the same time, the negative charge of the W(CO)5 group rises from -0.90 to -1.65 (table 2). The reason for these drastic changes becomes clearer if one takes into account the results of the CDA (table 3). For the magnesium complex a negative donation is obtained. Furthermore, the residual term amounts to 0.173 in this complex. This term results from the interaction between unoccupied orbitals of the two fragments, and therefore should be virtually zero for a true donor-acceptor complex. Neither the negative donation nor the high residual term is physically meaningful. A similar breakdown of the CDA has been observed in complexes which have covalent instead of donor-acceptor bonds [9,27]. It was concluded that, as soon as the Dewar-Chatt-Duncanson model becomes unsuitable for the description of a compound, the CDA fails to deliver a coherent picture. In this way, the CDA can be used as a theoretical indicator for the detection of donor-acceptor complexes. This leads to the conclusion that in the case discussed here, the CDA results indicate another deviation from the donor-acceptor model: the complexes [(CO)5W-Mg(NH3)3] and [(CO)5W-Na(NH3)3]- are better understood as contact ion pairs [(CO)5W]2-[Mg(NH3)3]2+ and [(CO)5W]2-[Na(NH3)3]+, in agreement with the NBO results discussed above. To the left of Al, the complexes become ionic compounds. But what happens to the right, with [SiCl2(NH3)] as a ligand? According to the NBO analysis, the coordinated Si has a lone pair with 30.2% s-character, which is higher than in the case of Al (23.9%, table 2). This is in line with the smaller amount of charge transferred to the W(CO)5 fragment, which has a partial charge of -0.55 in the case of the silicon ligand and of -0.90 for the aluminum ligand.
The CDA gives a donation/backdonation ratio of 1.1, which is slightly lower than the value of 1.3 obtained for the Al ligand (table 3). Overall, [SiCl2(NH3)] seems to be very similar to [AlCl(NH3)2] as a ligand. Although it is less strongly bonded to tungsten, it still is an interesting target for synthesis. Do the results obtained for the [(CO)5W-Y] series help us to understand the observed umbrella effect? The ionic nature of the magnesium and sodium compounds offers a new approach to the problem. Consider the free [W(CO)5]2- anion. As one would expect, it has the shape of a trigonal bipyramid [28]. Now consider the structures of [(CO)5W]2-[Mg(NH3)3]2+ and [(CO)5W]2-[Na(NH3)3]+ (figure 4). Both can be viewed as [W(CO)5]2- anions which are distorted by the approaching cations, [Mg(NH3)3]2+ and [Na(NH3)3]+, respectively. The strong bending of two of the CO groups towards the ligand can therefore be understood as the retaining force of the bipyramidal structure in which these groups hold equatorial positions. The question remains why the axial CO groups slightly bend towards the ligand, too. A possible explanation is steric pressure from the trans standing CO group. CO has a sterically demanding π-system and is closer to the tungsten atom than the metal ligands. This also explains why the axial bending is stronger in the sodium compound: because the W-Na distance is larger than the W-Mg distance, the sodium ligand needs less space. This completes the explanation of the umbrella effect for the ionic compounds. The donor-acceptor complexes of the group 13 and silicon ligands are more complicated. The neutral molecule W(CO)5 has the shape of a square pyramid [27(a)]. As charge is added, one would expect that it slowly approaches the bipyramidal shape of [W(CO)5]2-. However, [W(CO)5H]- (figure 2) is not a distorted trigonal bipyramid but a distorted octahedron. So obviously the charge transfer to the W(CO)5 fragment at first simply makes its electronic structure less rigid, until at a certain amount the bipyramidal form becomes dominant. This lowered rigidity allows the cis standing CO groups to evade the steric effect of the trans standing CO group by bending towards the ligand. Therefore we suggest that the driving force behind the umbrella effect is the charge that is transferred from the ligand to the tungsten atom.
4. SUMMARY AND CONCLUSIONS

The present study of the compounds [(CO)5W-XCln(NH3)m]x- gives a thorough description of the bonding between a main group metal and a transition metal for a wide range of main group elements. The MP2 optimized geometry of our model system [(CO)5W-AlCl(NH3)2] is in good agreement with the X-ray structure of the recently synthesized [(CO)5W-AlClL2] (L2 = tmeda). The analysis of the electronic structure using the NBO analysis and the newly developed CDA clearly shows that the free electron pair of aluminum forms a donor bond to tungsten and therefore [(CO)5W-AlCl(NH3)2] is best understood as a donor-acceptor complex. By investigating the base free complex [(CO)5W-AlCl] we could show that the NH3 ligands enhance the donor capability of the aluminum atom, leading to a substantially higher bond dissociation energy. This is accompanied by a marked decrease of the s-character of the donor electron pair of aluminum in [(CO)5W-AlCl(NH3)2]. The comparison of the group 13 elements in the series [(CO)5W-XCl(NH3)2] (X = B, Al, Ga, In, Tl) shows that the bonding in these molecules is very similar. All compounds should be described as donor-acceptor complexes. The lengthening of the W-X bond upon going from X = B to X = Tl and the corresponding reduction of the bond dissociation energy are explained by a decrease of the donor capability of the main group ligand. In the series [(CO)5W-Y] (Y = [SiCl2(NH3)], [AlCl(NH3)2], [Mg(NH3)3], [Na(NH3)3]-) of complexes along the second full row of the periodic system, a change in the type of bonding is observed. Like the aluminum compound, [(CO)5W-SiCl2(NH3)] is best understood as a donor-acceptor complex. For [(CO)5W-Mg(NH3)3] and [(CO)5W-Na(NH3)3]-, on the other hand, the CDA results, the calculated NPA charges and the Wiberg bond indices clearly indicate ionic bonding, corresponding to the interaction between [W(CO)5]2- and [Mg(NH3)3]2+ and [Na(NH3)3]+, respectively.
This ionic bonding can be viewed as the borderline case of donor-acceptor interaction, culminating in the complete transfer of the bonding electron pair. The bond dissociation energies accordingly increase from X = Si to X = Na. In the experimental as well as in the calculated structures [(CO)5W-XCln(NH3)m]x- a bending of the cis standing CO groups towards the main group metal ligand is observed. This umbrella effect cannot be explained by agostic interactions between the CO and NH3 groups, because it is also observed for [(CO)5W-H]-. As the bending is particularly strong for the ionic complexes [(CO)5W-Mg(NH3)3] and [(CO)5W-Na(NH3)3]-, the effect is traced back to the charge transfer from the ligand to the tungsten atom. The surprising observation that no bending occurs for [(CO)5W-AlCl] is a result of the strong backdonation into the empty p-orbitals of aluminum, which is less effective for the base coordinated ligands under investigation.
ACKNOWLEDGEMENTS
We want to thank Professor R. A. Fischer (Heidelberg) for informing us about his experimental results prior to publication and for helpful discussions. This work was financially supported by the Fonds der Chemischen Industrie and the Deutsche Forschungsgemeinschaft (SFB 260 and Graduiertenkolleg Metallorganische Chemie). U. P. thanks the Fonds der Chemischen Industrie for a doctoral scholarship. Computer time and excellent service were given by the HRZ Marburg, HHLRZ Darmstadt and HLRZ Jülich.
REFERENCES
1 L. Pauling, The Nature of the Chemical Bond and the Structure of Molecules and Crystals: An Introduction to Modern Structural Chemistry, Cornell University Press, Ithaca, New York, 1939.
2 A. E. Reed, L. A. Curtiss and F. Weinhold, Chem. Rev., 88 (1988) 899.
3 (a) R. F. W. Bader, Atoms in Molecules, Clarendon Press, Oxford, 1990. (b) R. F. W. Bader, Chem. Rev., 91 (1991) 893.
4 (a) K. Morokuma, J. Chem. Phys., 55 (1971) 1236. (b) K. Kitaura and K. Morokuma, Int. J. Quantum Chem., 10 (1976) 325. (c) K. Morokuma, Acc. Chem. Res., 10 (1977) 249.
5 (a) P. S. Bagus, K. Hermann and C. W. Bauschlicher, J. Chem. Phys., 80 (1984) 4378. (b) P. S. Bagus, K. Hermann and C. W. Bauschlicher, J. Chem. Phys., 81 (1984) 1966. (c) P. S. Bagus and F. Illas, J. Chem. Phys., 96 (1992) 8962.
6 G. Blyholder and M. Lawless, J. Am. Chem. Soc., 114 (1992) 5828.
7 S. Dapprich and G. Frenking, J. Phys. Chem., 99 (1995) 9352.
8 A. Haaland, Angew. Chem., 101 (1989) 1017; Angew. Chem., Int. Ed. Engl., 28 (1989) 992.
9 G. Frenking and U. Pidun, J. Chem. Soc., Dalton Trans., in print.
10 (a) M. J. S. Dewar, Bull. Soc. Chim. Fr., 18 (1951) C71. (b) J. Chatt and L. A. Duncanson, J. Chem. Soc., (1953) 2939.
11 H. A. Bent, Chem. Rev., 61 (1961) 275.
12 V. Jonas, C. Boehme and G. Frenking, Inorg. Chem., 35 (1996) 2097.
13 (a) D. M. Root, C. R. Landis and T. Cleveland, J. Am. Chem. Soc., 116 (1994) 4201. (b) C. R. Landis, T. Cleveland and T. K. Firman, J. Am. Chem. Soc., 117 (1995) 1859.
14 M. M. Schulte, E. Herdtweck, G. Raudaschl-Sieber and R. A. Fischer, Angew. Chem., 108 (1996) 489; Angew. Chem., Int. Ed. Engl., 35 (1996) 424.
15 J. Weiß, D. Stetzkamp, B. Nuber, R. A. Fischer, C. Boehme and G. Frenking, Angew. Chem., 109 (1997) 95; Angew. Chem., Int. Ed. Engl., 36 (1997) 70.
16 M. M. Schulte, R. A. Fischer, E. Herdtweck, L. Zsolnai, C. Boehme, S. F. Vyboishchikov and G. Frenking, J. Am. Chem. Soc., submitted for publication.
17 (a) C. Møller and M. S. Plesset, Phys. Rev., 46 (1934) 618. (b) J. S. Binkley and J. A. Pople, Intern. J. Quantum Chem., 9 (1975) 229.
18 (a) L. Szasz, Pseudopotential Theory of Atoms and Molecules, Wiley, New York, 1985. (b) M. Krauss and W. J. Stevens, Annu. Rev. Phys. Chem., 35 (1984) 357.
19 (a) R. Ditchfield, W. J. Hehre and J. A. Pople, J. Chem. Phys., 54 (1971) 724. (b) W. J. Hehre, R. Ditchfield and J. A. Pople, J. Chem. Phys., 56 (1972) 2257. (c) M. M. Francl, W. J. Pietro, W. J. Hehre, J. S. Binkley, M. S. Gordon, D. J. Defrees and J. A. Pople, J. Chem. Phys., 77 (1982) 3654.
20 P. J. Hay and W. R. Wadt, J. Chem. Phys., 82 (1985) 299.
21 A. Bergner, M. Dolg, W. Küchle, H. Stoll and H. Preuß, Mol. Phys., 80 (1993) 1431.
22 G. Frenking, I. Antes, M. Böhme, S. Dapprich, A. W. Ehlers, V. Jonas, A. Neuhaus, M. Otto, R. Stegmann, A. Veldkamp and S. F. Vyboishchikov, in Reviews in Computational Chemistry, K. B. Lipkowitz and D. B. Boyd (eds.), VCH, New York, 1996, vol. 8.
23 CDA 2.1, S. Dapprich and G. Frenking, Marburg, 1994. The program is available via anonymous ftp server: ftp.chemie.uni-marburg.de (/pub/cda).
24 M. J. Frisch, G. W. Trucks, H. B. Schlegel, P. M. W. Gill, B. G. Johnson, M. A. Robb, J. R. Cheeseman, T. A. Keith, G. A. Petersson, J. A. Montgomery, K. Raghavachari, M. A. Al-Laham, V. G. Zakrzewski, J. V. Ortiz, J. B. Foresman, J. Cioslowski, B. B. Stefanov, A. Nanayakkara, M. Challacombe, C. Y. Peng, P. Y. Ayala, W. Chen, M. W. Wong, J. L. Andres, E. S. Replogle, R. Gomperts, R. L. Martin, D. J. Fox, J. S. Binkley, D. J. Defrees, J. Baker, J. J. P. Stewart, M. Head-Gordon, C. Gonzalez and J. A. Pople, GAUSSIAN 94, Revision C.2, Gaussian Inc., Pittsburgh, PA, 1995.
25 W. Kutzelnigg, Angew. Chem., 96 (1984) 262; Angew. Chem., Int. Ed. Engl., 23 (1984) 272.
26 P. Pyykkö, Chem. Rev., 88 (1988) 563.
27 (a) U. Pidun and G. Frenking, Organometallics, 14 (1995) 5325. (b) U. Pidun and G. Frenking, J. Organomet. Chem., 525 (1996) 269.
28 U. Pidun, C. Boehme and G. Frenking, unpublished results.
Z.B. Maksić and W.J. Orville-Thomas (Editors), Pauling's Legacy: Modern Modelling of the Chemical Bond, Theoretical and Computational Chemistry, Vol. 6. © 1999 Elsevier Science B.V. All rights reserved.
Fundamental Features of Hydrogen Bonds
Steve Scheiner
Department of Chemistry, Mailcode 4409, Southern Illinois University, Carbondale, IL 62901, U.S.A.
1. INTRODUCTION
More than half a century has elapsed since Linus Pauling discussed his ideas about the fundamental nature and properties of the hydrogen bond in his landmark book [1]. Pauling had based these ideas on the information that was available at the time. He made reference to papers going back as far as 1912 [2] in which the presence of such an interaction was inferred from the weakness of a particular base. In 1920, mention was made of the widespread occurrence of H-bonds [3], in connection with anomalous properties of certain liquids. Most of the information that had accumulated up to 1940 was based on the structures found in crystals, infrared spectroscopic data in solution, and various other physical-chemical data. It is most remarkable that many of the ideas formulated in 1940 by Pauling about the hydrogen bond, and since tested by a panoply of newer experimental and theoretical methods, have shown themselves to be substantively correct. Perhaps even more impressive were Pauling's prophetic ideas about the importance of the H-bond. He stated in 1940 [1] that "I believe that as the methods of structural chemistry are further applied to physiological problems it will be found that the significance of the hydrogen bond for physiology is greater than that of any other single structural feature." The later award of a Nobel prize for unraveling the structure of the α-helix in proteins, which owes its existence largely to the H-bonds connecting successive turns, is testament to this earlier claim. Subsequent work indicated that H-bonds play an important role in other aspects of protein structure, e.g. the β-pleated sheet. This same phenomenon of H-bonding was shown to be crucial also in the transmission of genetic information, the basis of a Nobel prize for Watson and Crick's formulation of the structure of DNA. Pauling's description of the hydrogen bond remains valid today.
He conceptualized this AH···B interaction as largely ionic, with A and B both electronegative atoms. The strength of this interaction was suggested to rise with increasing electronegativity of A and B. Fluorine was thought to form the strongest H-bonds, followed by O and then N. Although Cl has an electronegativity roughly equal to that of N, the larger size of the former atom was proposed to lead to weaker interactions. Pauling's quantitative assessment of the strength of the H-bond was quite accurate. His guess of some 5 kcal/mol turns out to be quite close to the interaction energy in the water dimer, for example. In an interesting harbinger of a later controversy, Pauling suggested that there is little fundamental distinction between a "true" H-bond and the simple Coulombic attraction of a pair of dipoles. In other words, Pauling reiterated his idea that the H-bond is primarily an electrostatic interaction between the charge distributions on the two partner molecules. A number of aspects of H-bonds were summarized in his book. Pauling mentioned the stretch of several hundredths of an ångström undergone by the A-H bond of the proton donor molecule upon forming a H-bond. The red shift of the stretching band of this bond was thought to be roughly proportional to the strength of the H-bond, a relationship known as the Badger-Bauer rule. A numerical increment of 1 kcal/mol was assigned to the H-bond strength for each 35 cm^-1 reduction in this frequency. Pauling also pointed out the effects of an aromatic group, in that phenols formed stronger H-bonds than did alcohols. In concert with the strengthening of the H-bond caused by certain substitutions, Pauling described a shortening of the distance between the nonhydrogen atoms. A more subtle feature described by Pauling later became known as the cooperativity effect: a chain of n consecutive H-bonds is generally stronger than n individual H-bonds in isolation from one another. He delved into the unusual characteristics of very short, ionic H-bonds such as (FHF)-, questioning whether the equilibrium position of the proton is precisely midway between the two F nuclei.
Pauling also indicated that H-bonds did not have to involve two different molecules; it was deemed possible for two groups within the same molecule to form a H-bond with one another. This chapter celebrates the wisdom and chemical intuition of Linus Pauling. We will reexamine several of his notions about the hydrogen bond to see how well they have withstood the test of time, and the enormous amount of data accumulated from new experimental and theoretical methods that were not available to him in 1940.
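The Badger-Bauer increment quoted in this introduction (1 kcal/mol of H-bond strength per 35 cm^-1 of red shift in the A-H stretch) amounts to a one-line linear estimate. The sketch below is only an illustration of that arithmetic; the function name and the 175 cm^-1 input are hypothetical and are not data from the chapter.

```python
# Increment quoted in the text: 35 cm^-1 of red shift per 1 kcal/mol.
RED_SHIFT_PER_KCAL_MOL = 35.0  # cm^-1 per (kcal/mol)


def hbond_strength_from_red_shift(red_shift_cm1):
    """Estimate H-bond strength (kcal/mol) from an A-H stretch red shift (cm^-1).

    This is the simple proportionality described in the text, not a
    fitted or validated correlation.
    """
    if red_shift_cm1 < 0:
        raise ValueError("red shift is expected to be non-negative")
    return red_shift_cm1 / RED_SHIFT_PER_KCAL_MOL


# Hypothetical example: a 175 cm^-1 red shift corresponds to about 5 kcal/mol.
estimate = hbond_strength_from_red_shift(175.0)
print(f"{estimate:.1f} kcal/mol")
```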
2. HYDROGEN BONDING STRENGTH

Data concerning the energetic aspects of hydrogen bonding were obtained indirectly at the time Pauling first wrote about these interactions. Moreover, the systems studied were in solution or in the solid phase, making it difficult to extract the intrinsic properties of the H-bond, in isolation from other molecules. Gas-phase techniques have matured since that time, and are capable of very accurate measurements of the equilibrium geometry of H-bonded dimers. Assessment of the energetics is less direct, however. Computational methods have also matured quickly since 1940, fueled by rapid advances in computer hardware and software, which have motivated major strides in development of the formalism of electronic structure theory, coupled with code development. It has become possible in recent years to calculate the energetics of binding to an accuracy rivaling the best experimental techniques. As an added bonus, one computes this interaction energy on a purely electronic level, free of zero-point vibrational effects which can complicate interpretation of the experimentally determined interactions. The most accurate calculated values of the binding energies of a series of homodimers are presented in Table 1. Note that most of these values have still not been completely pinned down to the nearest 0.1 kcal/mol, but retain a range of uncertainty, depending upon the specific level of theory. Nonetheless, the trends are rather clear.
Table 1
Strengths of hydrogen bonds in homodimers (kcal/mol)

Dimer      -ΔE(elec) a    Reference
(HF)2      4.5-5.0        [4-9]
(H2O)2     4.7-5.0        [10-13]
(NH3)2     2.8-2.9        [14-16]
(HCl)2     1.6-2.0        [17-19]

a Values reported refer to calculated electronic contribution.
As predicted by Pauling, the H-bond formed by HCl is the weakest in this series. Slightly stronger, with an interaction energy of less than 3 kcal/mol, is the ammonia dimer. Indeed, a series of inquiries beginning in the late 1980s was devoted to a determination as to whether the ammonia dimer is held together by a H-bond, or by some other sort of force [20-24]. The hydrogen bonds holding together the HF and H2O dimers are considerably stronger, close to 5.0 kcal/mol. Somewhat at odds with Pauling's original idea, there does not appear to be much of a distinction between the strengths of the H-bonds in the two latter dimers. This similarity is probably due to the fact that while HF is indeed a better proton donor than is H2O, the better proton-accepting ability of H2O makes up for this difference in the homodimer. The abilities of the various molecules to act in the capacity of either a proton donor or acceptor are illustrated by the assortment of mixed dimers listed in Table 2. Although the electronic contributions to the binding energies were not computed at the highest level of theory currently available, we are most concerned here with a fair comparison. All values were computed at the same level, SCF, using the 6-31G** basis set.
Table 2
Strengths of hydrogen bonds a in mixed dimers (kcal/mol)

                     proton donor
acceptor       HF        H2O       NH3
HF             4.40      2.64      1.31
H2O            7.90      4.60      2.00
NH3           11.47      5.68      2.34

a Data from [25]
In reading across a given row of Table 2, the proton donor is being changed, holding fixed the same acceptor. Reading left to right, the weakening of the H-bond in the following order is immediately apparent.

donor: HF > H2O > NH3    (1)

The opposite order, wherein the more basic molecule serves as the most efficient proton acceptor, is also obvious from examination of the columns of the table.

acceptor: HF < H2O < NH3    (2)

Another important observation relates to the fact that the rate of change in each row is greater than that in a given column. In other words, the strength of the H-bond is more sensitive to the nature of the donor than to the acceptor.
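The orderings (1) and (2), and the greater sensitivity to the donor, can be checked mechanically against the entries of Table 2. The Python sketch below does so; using the ratio of the largest to the smallest entry as the measure of "rate of change" is an assumption made here for illustration, since the text does not define the measure.

```python
molecules = ["HF", "H2O", "NH3"]

# energies[acceptor][donor] in kcal/mol, transcribed from Table 2 (SCF/6-31G**).
energies = {
    "HF":  {"HF": 4.40,  "H2O": 2.64, "NH3": 1.31},
    "H2O": {"HF": 7.90,  "H2O": 4.60, "NH3": 2.00},
    "NH3": {"HF": 11.47, "H2O": 5.68, "NH3": 2.34},
}

# Ordering (1): for every fixed acceptor, donor strength falls HF > H2O > NH3.
donor_order_holds = all(
    energies[a]["HF"] > energies[a]["H2O"] > energies[a]["NH3"] for a in molecules
)

# Ordering (2): for every fixed donor, acceptor strength rises HF < H2O < NH3.
acceptor_order_holds = all(
    energies["HF"][d] < energies["H2O"][d] < energies["NH3"][d] for d in molecules
)

# Assumed measure of sensitivity: strongest/weakest ratio along a row
# (donor varied) versus along a column (acceptor varied).
row_ratios = {a: energies[a]["HF"] / energies[a]["NH3"] for a in molecules}
col_ratios = {d: energies["NH3"][d] / energies["HF"][d] for d in molecules}

print("donor order (1) holds:   ", donor_order_holds)
print("acceptor order (2) holds:", acceptor_order_holds)
print("row ratios (vary donor):      ", {k: round(v, 2) for k, v in row_ratios.items()})
print("column ratios (vary acceptor):", {k: round(v, 2) for k, v in col_ratios.items()})
```

With this measure every row ratio exceeds every column ratio, consistent with the statement that the H-bond strength depends more strongly on the donor.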
3. CONTRIBUTION OF ELECTROSTATICS

Another of Pauling's insights into the fundamental nature of the H-bond was the dominating influence of electrostatics. While experimental methods are capable of assessing the magnitude of the interaction energy under certain conditions, they cannot determine the origin of this binding force, except by inference. For example, statistical analyses of the geometries of H-bonds in a large number of crystals [26-28] indicate that a proton donor prefers to approach the oxygen acceptor atom in a direction consistent with the lone pairs in a classical hybridization scheme. In the gas phase, too, one may describe the equilibrium geometries in terms of the approach of a proton to the lone pair of the acceptor, with the central idea that electrostatic forces draw the proton toward the lone pair where the electron density is maximal [29-32].
A more formal electrostatic picture which ignores the existence of lone pairs as such, and instead concentrates on the charge distribution of a given molecule in terms of atomic charges, molecular multipoles, and so on, has further underscored the importance of these Coulombic forces. Referring again to the carbonyl oxygen atom examined in the crystal structures, it is possible to reproduce these experimental trends by simply calculating the electrostatic forces between pairs of molecules [33]. One study demonstrated that the angular aspects of experimental equilibrium geometries of H-bonded complexes tend to coincide with the minimum of the calculated electrostatic interaction of the two molecules [34]. A very simple model was developed in the early 1980s [35,36] that evaluates the interaction between molecules on the basis of only their electrostatic interaction and steric repulsions. The former was evaluated on the basis of point multipoles assigned to the atoms of each monomer; steric repulsions were modeled by simple hard spheres. This simple model successfully predicted the geometries of a number of different complexes, when measured against experimental structures. A more formal partitioning of the interaction energy into various terms [37] suggested one reason this model is so successful is that the forces neglected in the electrostatic + steric interaction model tend to cancel one another with respect to angular dependence. Some estimate of the numerical accuracy of electrostatics in predicting the angular aspects of a H-bonded complex may be obtained from the data in Table 3. The calculated values were obtained by first estimating the various values of a multipole expansion of the charge distribution of any given molecule, i.e. dipole, quadrupole, octopole etc. [38].
The electrostatic interaction was then computed by allowing these moments to interact with one another, and truncating the entire series after a given term; the data listed in Table 3 were obtained by terminating at the R^-5 level.
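As a reminder of how such a truncated series is organized, the following is a generic textbook sketch and not the specific expansion used in [38]. The interaction between a multipole of rank l_A on one molecule and a multipole of rank l_B on the other (l = 0, 1, 2, 3 for charge, dipole, quadrupole, octopole) falls off as R^-(l_A + l_B + 1). The orientation-dependent coefficients C_n(Ω) are notation introduced here for illustration only.

```latex
% Electrostatic energy as a series in inverse powers of the separation R
E_{\mathrm{ES}} \;\approx\; \sum_{n} \frac{C_n(\Omega)}{R^{\,n}},
\qquad n = l_A + l_B + 1

% For two neutral molecules the charge terms vanish, leaving
%   n = 3 : dipole-dipole
%   n = 4 : dipole-quadrupole
%   n = 5 : quadrupole-quadrupole and dipole-octopole

% Leading (dipole-dipole) term written out explicitly
E_{\mu\mu} \;=\; \frac{1}{4\pi\varepsilon_0}\,
\frac{\boldsymbol{\mu}_A\cdot\boldsymbol{\mu}_B
      - 3\,(\boldsymbol{\mu}_A\cdot\hat{\mathbf{R}})\,(\boldsymbol{\mu}_B\cdot\hat{\mathbf{R}})}
     {R^{3}}
```

Terminating "at the R^-5 level" therefore means keeping all terms through n = 5.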
Table 3
Comparison of experimental angles with those computed from electrostatic forces a

                acceptor angle       donor angle
complex         expt     calc        expt     calc
(HF)2           117      117         190      190
(H2O)2           60       48          52       69
(NH3)2           49        4         116       72
H2O--HF         134      126         180      171
H2O--NH3        157      183          56       58

a All values in degrees, data from [38]
The HF dimer is predicted to nearly perfect accuracy by this simple electrostatic prescription. The prediction deteriorates somewhat for the water dimer, but falls apart entirely for (NH₃)₂. The poor performance in the latter case is not surprising since the surface is extremely flat, even with the most precise full calculations available. Excellent agreement is obtained for the mixed dimers in the last two rows of Table 3. A later study, which carried the analysis one step further to R⁻⁶, indicated that the series is not yet entirely stable, and that changes of several degrees are to be expected at this level [39].
One can conclude that in many cases, an analysis of the electrostatic interaction between a pair of molecules will provide a very reasonable estimate of the angular aspects of their interaction. Exceptions will likely occur in the case of very flat surfaces, as in the ammonia dimer, or in weak complexes where dispersion plays an increasingly important role.
Table 4 provides some estimates of the magnitude of the electrostatic interaction in a typical hydrogen bond, in comparison to other components. The specific system examined in Table 4 is the water dimer; the R(OO) distances examined are 3.25 Å and 2.75 Å. The latter is approximately the interoxygen separation in ice; the two values bracket the equilibrium distance of 2.9-3.0 Å in the gas-phase dimer, depending upon the level of theory.

Table 4. Components of interaction energy in water dimer, and changes caused by 40° angular distortion of proton donor or acceptor ᵃ

                                         increment caused by 40° distortion
               undistorted value         proton donor          proton acceptor
component ᵇ    3.25 Å     2.75 Å         3.25 Å    2.75 Å      3.25 Å    2.75 Å
ΔE             -6.22      -7.37           4.43      6.24        2.84      4.74
ES             -5.73     -12.77           4.34      9.54        2.93      5.49
R⁻²             0.0        0.0            0.0       0.0         0.0       0.0
R⁻³            -2.92      -4.82           3.43      5.67        2.06      3.40
R⁻⁴            -1.71      -3.33          -0.68     -1.34        1.17      2.29
R⁻⁵            -0.51      -1.17           0.73      1.69       -0.80     -1.84
MP5            -5.14      -9.32           3.48      6.01        2.43      3.84
EX              1.16       9.14          -0.76     -5.56       -0.15     -0.91
POL            -0.37      -0.97           0.21      0.57        0.11      0.16
CT             -1.18      -2.33           0.54      1.29       -0.06     -0.05
MIX            -0.10      -0.44           0.09      0.40        0.01      0.04

ᵃ All values in kcal/mol; the distances heading each column are R(O..O); data from [40].
ᵇ Abbreviations are as follows: ΔE = total interaction energy, ES = electrostatic, MP5 = multipole series through R⁻⁵, EX = exchange, POL = polarization energy, CT = charge transfer, MIX = mixing term.
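The internal bookkeeping of Table 4 can be checked in a few lines. The sketch below only transcribes the two undistorted-geometry columns of the table; the dictionary and function names are illustrative and not part of the original work [40]. It assumes that the listed components add up to the total (ΔE = ES + EX + POL + CT + MIX) and that MP5 collects the R⁻² through R⁻⁵ terms.

```python
# Undistorted-geometry columns of Table 4 (kcal/mol), keyed by R(OO) in angstroms.
# Names are illustrative; the numbers are transcribed from the table.
table4 = {
    3.25: {"dE": -6.22, "ES": -5.73, "R-2": 0.0, "R-3": -2.92, "R-4": -1.71,
           "R-5": -0.51, "MP5": -5.14, "EX": 1.16, "POL": -0.37, "CT": -1.18,
           "MIX": -0.10},
    2.75: {"dE": -7.37, "ES": -12.77, "R-2": 0.0, "R-3": -4.82, "R-4": -3.33,
           "R-5": -1.17, "MP5": -9.32, "EX": 9.14, "POL": -0.97, "CT": -2.33,
           "MIX": -0.44},
}


def component_sum(row):
    """Sum of the energy components that should reproduce the total dE."""
    return row["ES"] + row["EX"] + row["POL"] + row["CT"] + row["MIX"]


def multipole_sum(row):
    """Sum of the multipole terms through R^-5, to be compared with MP5."""
    return row["R-2"] + row["R-3"] + row["R-4"] + row["R-5"]


for distance, row in table4.items():
    print(f"R(OO) = {distance} A: "
          f"components sum to {component_sum(row):.2f} (dE = {row['dE']:.2f}); "
          f"multipole terms sum to {multipole_sum(row):.2f} (MP5 = {row['MP5']:.2f}); "
          f"MP5/ES = {row['MP5'] / row['ES']:.2f}")
```

With these numbers the truncated series recovers roughly 90% of ES at 3.25 Å but only about 73% at 2.75 Å.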
We concentrate first on the first two columns of data. The first row reports the total interaction energy, ΔE, where the negative sign indicates an attractive force. The electrostatic component of this total interaction is listed as the ES term in the next row. It is important to note that whereas ES represents a reasonable approximation to the full ΔE for R=3.25 Å, the compressed H-bond leads to a divergence in that the ES term becomes much more negative than ΔE.
The next several rows investigate the adequacy of a multipole expansion of the electrostatic interaction. The R⁻² term is identically zero since it represents the charge-dipole interaction and both water molecules are electrically neutral. The interaction between the two dipoles resides in the R⁻³ term. This interaction is attractive but makes up only a fraction of the full ES interaction. A second major attractive component comes from the dipole-quadrupole interactions in the R⁻⁴ term, with a minor addition in the next term as well. When summed together, the first four terms in the multipole series comprise the MP5 term in the next row of Table 4. This truncated series provides an excellent estimate of the full ES component for R=3.25 Å, but loses some of its accuracy when the two water molecules are pulled up on the repulsive part of their potential, at R=2.75 Å.
Components other than electrostatic are listed in the next several rows of Table 4. EX refers to the exchange repulsion between the static charge clouds of the two monomers, roughly equivalent to what is commonly known as the steric repulsion. This term is rather small for 3.25 Å but grows exponentially, increasing by nearly an order of magnitude when R is diminished by 0.5 Å. It is this force which prevents the collapse of the two monomers into one another. The next two terms originate in the perturbation of the charge cloud of one monomer caused by the presence of the other. The polarization energy (POL) is related to charge rearrangements within one monomer or the other, and charge transfer (CT) is due to displacements of electron density between the two subunits. Note that both provide small attractive contributions. The last (MIX) term in Table 4 is a sum of higher-order terms that do not fit neatly into the above categories; it is rather small in this case.
Examination of the first two columns of Table 4 indicates that for H-bonds that are stretched somewhat beyond their equilibrium geometry, the electrostatic term furnishes an excellent estimate of the full interaction energy, and the former is in turn nicely reproduced by a truncated multipole series. The other contributions, EX, POL, CT, and MIX, are all smaller and cancel to a large extent. The same is not true when the H-bond is compressed. Exchange repulsion grows rapidly and cannot be ignored. Moreover, the multipole series deviates significantly from the full ES component.
As pointed out above, there is evidence that electrostatic forces dominate the nature of the equilibrium geometry of many H-bonded complexes. The underlying assumption is that these Coulombic forces are much more anisotropic than the other components, and therefore guide the angular aspects of the structure. There is in fact verification of this contention at high levels of theory for systems like the water dimer [41].
The last four columns of Table 4 contain numerical data that largely support this contention. The entries represent the changes in the various quantities that arise when either the proton donor or acceptor molecule is rotated by 40° from the equilibrium geometry. For example, a 40° distortion of the donor molecule, when R=3.25 Å, raises the energy of the system by 4.43 kcal/mol, hence diminishing the H-bond energy by this amount. The same distortion reduces the H-bond energy by 6.24 kcal/mol when the two waters are separated by 2.75 Å. The last two columns indicate that the interaction is somewhat less sensitive to 40° rotations of the proton acceptor molecule.
Comparison of the first and second rows of Table 4 indicates the strong parallel between the total interaction energy and its electrostatic component, especially when the H-bond is stretched to 3.25 Å. For the compressed H-bond, the ES term exaggerates the change in the full ΔE. The changes in the POL, CT, and MIX terms are rather small, and generally destabilizing. More significant is the exchange repulsion, which is substantially relieved by the distortions, since these reduce the overlap between the charge clouds of the two monomers.
Examination of the various contributors to ES reveals that the dipole-dipole R⁻³ term is particularly sensitive to angular distortions. Whereas the dipole-quadrupole interactions contained in the R⁻⁴ term are also sizable, it is important to note that they behave differently depending upon which molecule is rotated. That is, the R⁻⁴ term produces a net stabilization if the donor is turned but adds to the destabilization of R⁻³ if the rotation occurs in the acceptor. Overall, the multipole series, truncated at R⁻⁵, provides a reasonable approximation of the full ES distortion energy, particularly at the longer distance.
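The angular sensitivity of a dipole-dipole term can be illustrated with the textbook point-dipole expression U = [μ₁·μ₂ - 3(μ₁·R̂)(μ₂·R̂)]/R³ (atomic units). The sketch below is not the calculation reported in [40]: the two ideal point dipoles, their unit magnitude, the collinear head-to-tail starting arrangement, and the 6 bohr separation are all assumptions chosen purely for illustration.

```python
import math


def _dot(a, b):
    """Dot product of two 3-vectors given as sequences."""
    return sum(x * y for x, y in zip(a, b))


def dipole_dipole_energy(mu1, mu2, r_vec):
    """Point-dipole interaction energy in atomic units.

    U = [mu1.mu2 - 3 (mu1.rhat)(mu2.rhat)] / R**3
    """
    r = math.sqrt(_dot(r_vec, r_vec))
    rhat = [c / r for c in r_vec]
    return (_dot(mu1, mu2) - 3.0 * _dot(mu1, rhat) * _dot(mu2, rhat)) / r**3


def tilted_dipole(magnitude, angle_deg):
    """Dipole lying in the xz-plane, tilted by angle_deg away from +z."""
    a = math.radians(angle_deg)
    return (magnitude * math.sin(a), 0.0, magnitude * math.cos(a))


R_VEC = (0.0, 0.0, 6.0)  # hypothetical separation along z, in bohr
MU = 1.0                 # hypothetical dipole magnitude, atomic units

u_aligned = dipole_dipole_energy(tilted_dipole(MU, 0), tilted_dipole(MU, 0), R_VEC)
u_tilted = dipole_dipole_energy(tilted_dipole(MU, 0), tilted_dipole(MU, 40), R_VEC)

print(f"aligned: {u_aligned:.5f}  tilted by 40 deg: {u_tilted:.5f}  "
      f"increment: {u_tilted - u_aligned:.5f}")
```

In this idealized arrangement the energy varies as -2μ²cos θ/R³, so a 40° rotation of one dipole removes about a quarter of the attraction; real monomers carry higher moments as well, which is why the R⁻⁴ and R⁻⁵ terms in Table 4 matter.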
This section concludes with the caveat that any means of partitioning the total interaction energy into various quantities is arbitrary. It is hence possible that whereas one partitioning method will designate electrostatics as a dominating contributor to H-bonding, another scheme may not. For example, the results presented in Table 4 were obtained by the widely used Kitaura-Morokuma decomposition scheme [42,43]. As does any partitioning procedure, this one has a number of flaws, particularly in the separation of CT and POL energies [44,45]. On the other hand, a natural bond orbital analysis of molecular interactions concluded that charge transfer plays a very important role in formation of the H-bond in water dimer [46], consistent with a picture of H-bond formation which relies on HOMO-LUMO interactions between the proton donor and acceptor [47,48]. Of course, this scheme is also subject to criticism in a number of respects [49].
4. RELATIONS BETWEEN VARIOUS PROPERTIES
There are a number of characteristic properties of hydrogen-bonded systems. In the first place, the distance between the heavy atoms is expected to be shorter than the sum of their van der Waals radii. Along with formation of the AH...B interaction comes a stretch of the bridging hydrogen away from the A atom to which it is covalently attached. This AH bond is also involved in striking spectroscopic changes. Its stretching frequency, commonly referred to as νs, is shifted by several hundred wave numbers to the red, while gaining in intensity and broadening. Formation of a H-bond leads to several new intermolecular vibrational modes, one of which can frequently be associated with a stretch in the intermolecular distance, νσ. The electron density shifts which arise from the H-bonding result in perturbations of the NMR proton shielding tensor, deshielding the bridging hydrogen.
The quantities mentioned above all appear to be correlated with one another, and can be used as indicators of the strength of the H-bond. For example, the magnitude of the νs red shift was noted as an early indicator of the strength of the H-bond [50-52]. The intermolecular H-bond stretching frequency is also directly related to the strength of the H-bond. The NMR isotropic shielding and anisotropies tend to correlate with the length and strength of the H-bond [53-56], as do the peak volumes in the solid state heteronuclear correlation spectra [57]. As the distance between the two subunits grows, the stretch of the A-H bond is diminished [58].
In one attempt to examine some of these relationships on a quantitative level, the geometry of the water dimer was optimized for a series of different R(O..O) distances [59]. This optimization led to an evaluation of both the stretch in the r(OH) bond, and the energetics of the interaction at that particular interoxygen separation. A number of different basis sets were considered so as to generalize the results as much as possible. It was found that there is a clear and very nearly linear relationship between the interaction energy and this bond stretch.
Just as the energy, when plotted against R(O..O), shows the familiar Morse function-like behavior, so too does r(OH), albeit inverted. As the two water molecules are brought toward one another, r(OH) begins to stretch. This stretch is gradual at first, but becomes steeper as R(O..O) approaches its equilibrium separation. If the H-bond is squeezed further so that R(O..O) is forced to be smaller than its equilibrium value, r(OH) begins to shorten again. The various data points could be fit to a linear relationship between r(OH) and the interaction energy, with a correlation coefficient exceeding 0.99 for any basis set examined. The slope of the line was such that each 0.001 Å stretch of r(OH) can be associated with a bond strengthening of 2 kcal/mol. One can also describe r(OH) by a Morse function, similar to that which is typically applied to the energy:
$$ r = r_d - (r_d - r_m)\left(1 - \exp[-\alpha(R - R_0)]\right)^2 \qquad (3) $$
where $r_d$ and $r_m$ are the r(OH) bond lengths in the fully optimized water dimer and monomer, respectively, and $R$ refers to the interoxygen separation, with $R_0$ representing its equilibrium value.
A graphic illustration of the correlations between some of the quantities mentioned above can be taken from a recent set of computations that pair HCl with a set of pyridines, each substituted in the 4-position with CN, F, Cl, H, or CH₃ [60]. The solid curve in Fig. 1a illustrates the nearly linear relationship between the strength of the hydrogen bond, -ΔEelec, and the stretch of the HCl bond. An increase in the H-bond energy by less than 3 kcal/mol increases the stretch in this bond by some 0.025 Å. The broken curve in Fig. 1a reveals the contraction of the H-bond, with R(Cl..N) decreasing by about 0.1 Å in this same interval.
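The Morse-like form of Eq. (3) is easy to evaluate numerically. The following sketch implements the expression for r(OH) as a function of the interoxygen separation; the parameter values (r_d, r_m, alpha, R0) are placeholders chosen only so that the example runs, and are not taken from [59].

```python
import math


def r_oh(R, r_d, r_m, alpha, R0):
    """Eq. (3): r = r_d - (r_d - r_m) * (1 - exp[-alpha (R - R0)])**2."""
    return r_d - (r_d - r_m) * (1.0 - math.exp(-alpha * (R - R0))) ** 2


# Placeholder parameters (hypothetical, for illustration only).
r_d, r_m = 0.962, 0.957   # r(OH) in optimized dimer and in monomer, angstroms
alpha, R0 = 1.5, 2.95     # range parameter (1/angstrom), equilibrium R(OO)

for R in (2.75, 2.95, 3.25, 4.0, 8.0):
    print(f"R(OO) = {R:4.2f} A  ->  r(OH) = {r_oh(R, r_d, r_m, alpha, R0):.4f} A")
```

The function reaches its maximum, r_d, at R = R0, falls back toward the monomer value r_m at large separation, and also shortens when the H-bond is compressed below R0, which is the inverted Morse behavior described in the text.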
[Figure 1 plot: a) HCl bond stretch (solid curve) and R(Cl..N) (broken curve) versus -ΔEelec (kcal/mol); b) red shift Δνs (solid curve) and intensity ratio (broken curve) versus -ΔEelec (kcal/mol).]
Figure 1. Correlations of a) geometrical and b) spectroscopic properties with H-bond strength in complexes of HCl with 4-substituted pyridines [60].
Fig. 1b illustrates the variation of the spectroscopic properties of the HCl stretching vibration. The red shift of this band varies by more than 300 cm⁻¹ as a result of the aforementioned strengthening of the H-bond by less than 3 kcal/mol, as indicated by the solid curve in Fig. 1b. The broken curve represents the intensification of this band, as a ratio of the intensity in the complex to that in the isolated HCl monomer. Note that this intensity rises by two orders of magnitude as a result of formation of the H-bond, and is roughly proportional to the strength of the interaction.
As the proton affinity of the substituted pyridine increases beyond the range illustrated in Fig. 1, and as the H-bond is further strengthened, there comes a point where the bridging proton moves closer to the base than to the halide atom. In such a case, the system is better described as an ion pair, e.g. Cl⁻...⁺HPyr. At this point, the stretching frequency that is observed in the spectrum is no longer that of Cl-H, but rather ⁺H-Pyr. According to data collected over the years for this set of systems [60], and many others, a further enhancement of the proton affinity of the base then leads to a progressive weakening of the A⁻...⁺HB interaction, since a stronger base B is equivalent to a weaker acid ⁺HB. As a result, the H-B frequency rises, i.e. its red shift diminishes, as B becomes more basic.
Vibrational frequencies involving the A-H stretch are not the only modes that are directly related to the strength of the H-bond. The acceptor molecule, too, is influenced in ways that mirror the interaction. For example, calculations [61] have illustrated that the shift in the C=O stretching frequency of a carbonyl acceptor is linearly related to the interaction energy in such a way that each 1 kcal/mol increase in the binding energy results in a 2 cm⁻¹ red shift. This sort of relationship is confirmed by experimental measurements [62].
5. COOPERATIVITY
The formation of a H-bond causes redistributions in the electronic structure of each subunit, and alters their polarizability. These perturbations lead to the ideas expounded by Pauling that the ability of either of these two molecules to form another H-bond is altered by their participation in the first bond. Consider, for example, the pair of molecules AH and BH, each of which has a proton to donate in a H-bond, and each of which contains one or more lone electron pairs appropriate to accept a proton. If they form a H-bond of the type AH...BH, the proton of BH is still available to form a H-bond to another molecule, CH. But the CH molecule will encounter two different situations depending upon whether the BH molecule is involved in the aforementioned dimer, or is a single isolated BH molecule. In fact, formation of the AH...BH complex will remove electron density from the BH subunit, density which is transferred across to AH. This loss of negative charge will make BH a more powerful proton donor so that one can expect the BH...CH interaction in the AH...BH...CH trimer to be stronger than in the simpler BH...CH dimer.
Analogous reasoning would make the AH molecule in AH...BH a better proton acceptor, in comparison to the isolated AH molecule. These effects, which make the "whole larger than the sum of its parts", wherein a chain of H-bonds is more strongly bound together than any of the individual links would be in the absence of the others, are an expression of the "cooperative" nature of H-bonds. It is this cooperativity that leads to the common occurrence of long strings of H-bonds. It has been noted from surveys of crystal studies, for example, that H-bonds that occur as parts of such strings tend to be considerably shorter, and presumably stronger, than isolated H-bonds [63]. Rigorous calculations have confirmed this reasoning, indicating that it is polarizability of this sort that is largely responsible for the nonadditivity effects in H-bond chains [64].
But it should be emphasized that multiple H-bonds are not always cooperative in a positive sense. Again referring to the AH...BH dimer, the electron removal from BH makes this molecule not only a better proton donor, but also a poorer proton acceptor. So BH would now be less inclined to accept a proton from another molecule like CH. For this reason, one would expect the total interaction energy in a trimer like that in Fig. 2 to be weaker than in the pair of dimers AH...BH and CH...BH. This sort of general weakening is sometimes referred to oxymoronically as "negative cooperativity".
[Figure 2 diagram: molecule BH accepting protons from both AH and CH.]
Figure 2. Example of negative cooperativity where B serves as proton acceptor to both AH and CH.
It is usually energetically unfavorable for a molecule to act as a double proton acceptor as BH would be in Fig. 2. For similar reasons, cooperativity is typically negative also when a molecule acts as double proton donor. Of course, even in the case of negative cooperativity, formation of the second H-bond is usually energetically favorable when compared to the complete absence of a second H-bond. That is, even though the CH...BH interaction energy above is weaker than it would be in the absence of the other proton donor, AH, this interaction energy is still negative, and so the bond will form spontaneously. In other words, two H-bonds are always better than one (or usually so).
We focus here on HCN as an element in a chain of H-bonded units. The presence in this molecule of only one proton and one lone electron pair provides a simple testing ground for ideas about cooperativity.
5.1. Geometries
Fig. 3 illustrates how the C-H and C≡N bonds change their length as the chain of (HCN)n is built up, one subunit at a time. In the jump from the monomer to dimer, Fig. 3a illustrates the expected elongation of the C-H bond in the proton donor molecule, and the concomitant smaller increase in the C-H bond length of the proton acceptor molecule. As the chain gets longer, the donor designation indicates the leftmost molecule of NCH...(NCH)n-2...NCH, and the furthest right molecule is considered the acceptor. Fig. 3a indicates that these elongation trends continue with each additional molecule added in the middle of the chain, although these bond lengths slowly approach asymptotes as n continues to increase. For odd n, we also consider the central molecule, which has an equal number of molecules on its left as on its right. The data points in Fig. 3a show how the elongating effects of being a donor or acceptor reinforce one another in the middle molecule, which simultaneously plays both roles. Hence, the C-H bonds are longest in the central molecule.
[Figure 3 plot: optimized bond lengths (Å) of a) C-H and b) C≡N versus chain length n, with separate curves for the donor, middle, and acceptor molecules.]
Figure 3. Optimized lengths of a) C-H and b) C≡N bonds in linear arrangements of (HCN)n, n=1-5. Donor refers to the leftmost molecule and acceptor to the rightmost in NCH...(NCH)n-2...NCH. The middle designation indicates the central molecule for n=3 and n=5. Data taken from SCF optimizations with a [53/3] basis set in [65].
Analogous data are presented in Fig. 3b for the CN bonds in the HCN linear chains. In this case, the act of donating a proton makes the CN bond longer while it becomes shorter if the molecule acts as acceptor. Again, the bond lengths approach asymptotes for the terminal molecules as the chain grows longer. As before, the central molecule acts both as proton donor and acceptor, but in this case the middle molecule shows little modification as compared to the monomer, since these two roles produce opposite effects on the CN bond length.
5.2. Energetics
The binding energies of the various linear oligomers are listed in Table 5. The first two columns report the energetics of assembling each complex from n isolated monomers, and so represent the total binding energy and enthalpy, respectively. -ΔEelec for assembly of the full trimer is 12.5 kcal/mol. Since this total is more than twice that of -ΔEelec for the dimer, the two H-bonds in the trimer are stronger in sum than two isolated H-bonds, as would occur in a pair of dimers. This difference is reported in the next two columns of Table 5 as the cooperativity. Hence, for either ΔEelec or ΔH°, the binding energy of the trimer exceeds that of two dimers by 1.3 kcal/mol.
Table 5. Energetics of binding (kcal/mol) in linear oligomers (HCN)n, as calculated at the SCF level by [65].

                              coop ᵃ                -ΔQn/(n-1)
n     -ΔEelec    -ΔH°      -ΔEelec    -ΔH°       -ΔEelec    -ΔH°
2       5.60      4.72                              5.60     4.72
3      12.52     10.75       1.32     1.31          6.26     5.38
4      19.89     17.21       1.55     1.53          6.63     5.74
5      27.47     23.87       1.69     1.66          6.87     5.97

ᵃ Cooperativity is defined here as follows: if ΔQn is the property of interest for (HCN)n, coop is evaluated as [ΔQn - (n-1)ΔQ2] / (n-2).
The series progresses to n=4 in the next row where total binding energies are nearly 20 kcal/mol. These totals exceed the sum of three dimers by 3 kcal/mol. The latter amount is divided by two in the next two columns of Table 5 to facilitate comparison with the trimer in the preceding row. That is, there are two "trimers" present within the tetramer so the quantity is divided by 2 for a fair comparison. Note that these cooperativities of 1.5 kcal/mol are larger than the values for the trimer. The cooperativity increases further in the pentamer. One can imagine that these cooperativities continue to increase while approaching an asymptote as n→∞.
The convergence of the binding energy with chain length can be gleaned from the last two columns of Table 5. -ΔQn/(n-1) refers to the average H-bond energy of a given oligomer, where the full interaction energy of the n monomers is divided by the number of pairs in the oligomer. For example, while -ΔEelec is 5.6 kcal/mol for the dimer, the average interaction energy of the four H-bonds present in (HCN)5 is approaching 7 kcal/mol.
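The cooperativity and average H-bond energy columns of Table 5 follow directly from the total binding energies. The sketch below recomputes them from the -ΔEelec column using the definition in footnote a of the table; the function and variable names are illustrative only.

```python
# Total binding energies -dE_elec (kcal/mol) of linear (HCN)n, from Table 5.
binding = {2: 5.60, 3: 12.52, 4: 19.89, 5: 27.47}


def cooperativity(q, n):
    """[Q_n - (n-1) Q_2] / (n-2), defined for n >= 3 (footnote a of Table 5)."""
    if n < 3:
        raise ValueError("cooperativity requires at least a trimer")
    return (q[n] - (n - 1) * q[2]) / (n - 2)


def average_hbond_energy(q, n):
    """Total binding energy divided by the n-1 H-bonds of a linear chain."""
    return q[n] / (n - 1)


for n in sorted(binding):
    coop = f"{cooperativity(binding, n):.2f}" if n >= 3 else "   -"
    print(f"n = {n}: coop = {coop}, "
          f"average H-bond energy = {average_hbond_energy(binding, n):.2f}")
```

Dividing by n-2 normalizes the excess binding to the number of "trimers" (adjacent H-bond pairs) in the chain, which is the comparison made in the text for the tetramer.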
5.3. Vibrational Spectra
The vibrational frequencies are not so clearly identified with any particular single bond as are geometrical features. Nevertheless, one is able to identify various modes as largely C-H or C≡N stretches, or intramolecular bends, even in the longer chains. With respect to the C-H stretches, the longer oligomers exhibit a spread of frequencies. The highest of these is fairly clearly identified with the proton acceptor molecule and the others are within a narrow range of each other. We denote the lowest frequency as that of the terminal donor molecule, with the caveat that this designation is somewhat arbitrary. Analogous reasoning was applied to the CN stretches.
The calculated frequencies exhibit behavior very much like the bond lengths in some ways. For example, the CH stretching frequencies in Fig. 4a suffer decreases as the chain grows, with this red shift being more pronounced for the proton donor end of the chain. These frequency drops are consistent with the bond stretches described in Fig. 3a. In fact, the two plots are nearly perfect mirrors of one another. The CN stretching frequency of the donor end of the chain, illustrated in Fig. 4b, diminishes with larger n, also consistent with the analogous bond stretches of Fig. 3b; the behavior of the acceptor is again opposite in sign. Note the approach to an asymptote for all frequencies in Fig. 4.
Another important aspect is the magnitudes of changes in bond length and frequency. The CH bonds stretch by nearly 0.01 Å upon forming the serial H-bonds; the red shift of the associated frequency is some 150 cm⁻¹. In contrast, the CN bonds change their length by less than 0.002 Å, with shifts of only 20 cm⁻¹ or so.
[Figure 4 plot: calculated stretching frequencies (cm⁻¹) of a) C-H and b) C≡N versus chain length n, with curves for the donor and acceptor molecules.]
Fig. 4. Calculated stretching frequencies of a) C-H and b) C≡N bonds in linear arrangements of (HCN)n, n=1-5. Donor refers to the leftmost molecule (assumed to be of the lowest frequency) and acceptor to the rightmost (highest ν) in NCH...(NCH)n-2...NCH. Data derived at SCF/[53/3] level in [65].
5.4. Energy Components
Just as in the case of the dimers discussed above, it is possible to decompose the energy of a cluster of n molecules so as to extract information about the underlying cause of the cooperativity that is observed. The condensed state of water serves as perhaps the most ubiquitous situation where cooperativity exists, and is of greatest interest to biology and Pauling's predictions about the physiological importance of H-bonds. The water molecules in ice are arranged in what can be described as hexamers, as illustrated in Fig. 5. Note that this structure is not of the purely sequential type where all six molecules act as both donor and acceptor: one molecule (#5) serves as double donor and another (#4) as double acceptor.
[Figure 5 diagram: ring of six H-bonded water molecules with numbered oxygen atoms.]
Fig. 5. Proton donor and acceptor characteristics in water hexamer examined in [66].
The two and three-body interaction energies in this water hexamer are listed in Table 6. The two-body term is defined as the interaction energy computed for any pair of subunits, in the absence of any others, and in the geometry adopted in the oligomer. Three-body terms refer to the total interaction energy of any given triad of subunits, minus the sum of the three two-body interactions present in this same triad. In essence, the three-body term is similar to the cooperativity parameters described in Section 5.2, except that the geometries are not reoptimized for each of the monomer, dimer, and trimer, but are instead all frozen in the structure of the oligomer. Along with the total interaction energies listed in the first column of data in Table 6, this term (two or three-body) is decomposed in the following columns into its electrostatic, exchange, polarization, and charge transfer components, using the same formalism as mentioned in Section 3.
Beginning our discussion with the two-body terms in the upper part of Table 6, the results for all adjacent molecules are identical to the data for the 1-2 pair in the first row of the table. This similarity arises because all adjacent pairs constitute a single H-bond; the concept of double donor or acceptor is only meaningful within the context of three or more molecules. The interaction energy amounts to -2.8 kcal/mol. The attractive electrostatic energy is canceled by the exchange repulsion; polarization and charge transfer energies are both attractive. For all nonadjacent pairs, the EX, POL, and CT terms are quite small, leaving only electrostatic energy in the pairwise interactions. Because of the long-range character of the ES term, there are significant contributions even from molecules on opposite ends of the ring, e.g. 1-4 or 3-6. The signs of these nonadjacent pairwise electrostatic energies can be understood on the basis of the orientations of the particular molecules. For example, molecules 3 and 5 have H atoms pointed at one another, leading to the repulsive 3-5 term.
When summed together, the two-body terms amount to -19.2 kcal/mol, less attractive by 4 kcal/mol than the full interaction energy in the hexamer (see last row). With respect to the individual components, the electrostatic term is by nature fully additive, so the sum of two-body terms is equal to the full ES energy of the hexamer. The exchange is very nearly additive, with a discrepancy of only 0.1 kcal/mol. The sums of the two-body polarization and charge transfer components are each about 2 kcal/mol less attractive than the full components in the hexamer.
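The two- and three-body definitions given above can be written compactly. The sketch below assumes a hypothetical callable `energy(subset)` that returns the total energy of any subset of monomers frozen in the oligomer geometry; in the work cited such energies come from quantum chemical calculations [66], whereas the toy model supplied here is invented purely to exercise the code.

```python
from itertools import combinations


def interaction_energy(energy, subset):
    """Energy of a cluster minus the energies of its isolated monomers."""
    return energy(subset) - sum(energy((m,)) for m in subset)


def two_body_terms(energy, monomers):
    """Pair interaction energies, each computed in the absence of all others."""
    return {pair: interaction_energy(energy, pair)
            for pair in combinations(monomers, 2)}


def three_body_terms(energy, monomers):
    """Triad interaction energy minus the three pair terms inside that triad."""
    pairs = two_body_terms(energy, monomers)
    terms = {}
    for triad in combinations(monomers, 3):
        pair_sum = sum(pairs[p] for p in combinations(triad, 2))
        terms[triad] = interaction_energy(energy, triad) - pair_sum
    return terms


def toy_energy(subset):
    """Hypothetical model: monomer energies, a pairwise part, and an extra
    nonadditive contribution for each triad of consecutive labels."""
    s = tuple(sorted(subset))
    e = -10.0 * len(s)
    e += sum(-1.0 / abs(i - j) for i, j in combinations(s, 2))
    e += sum(-0.1 for t in combinations(s, 3) if t[2] - t[0] == 2)
    return e


monomers = (1, 2, 3, 4)
two_body = two_body_terms(toy_energy, monomers)
three_body = three_body_terms(toy_energy, monomers)
print(two_body)
print(three_body)
```

Because every energy is evaluated at the frozen oligomer geometry, the three-body terms isolate the nonadditive part of the interaction; in the toy model they recover exactly the extra contribution that was built in for consecutive triads.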
Table 6. Components (kcal/mol) of two and three-body interactions computed for the water hexamer illustrated in Fig. 5. Data from [66].

                   ΔEelec      ES       EX      POL       CT
2-body terms
1-2                 -2.8     -13.1     14.5     -1.4     -2.8
1-3                 -1.2      -1.1      0.0      0.0      0.0
2-4                 -0.6      -0.6      0.0      0.0      0.0
3-5                  0.8       0.8      0.0      0.0      0.0
4-6                  0.4       0.4      0.0      0.0      0.0
1-5                 -0.6      -0.6      0.0      0.0      0.0
1-4                  0.2       0.2      0.0      0.0      0.0
2-5                  0.2       0.2      0.0      0.0      0.0
3-6                 -0.4      -0.4      0.0      0.0      0.0
SUM (ΣΔE2)         -19.2     -81.0     87.0     -8.4    -16.8

3-body terms
1-2-3               -1.4       0.0      0.0     -0.8     -0.6
2-3-4               -1.0       0.0      0.0     -0.6     -0.4
3-4-5                0.8       0.0     -0.1      0.5      0.4
4-5-6                1.2       0.0      0.2      0.4      0.5
1-3-5                0.0       0.0      0.0      0.0      0.0
2-4-6                0.0       0.0      0.0      0.0      0.0
1-2-4               -0.1       0.0      0.0      0.0      0.0
1-3-4               -0.1       0.0      0.0     -0.1      0.0
1-4-5                0.2       0.0      0.0      0.1      0.1
2-5-6               -0.2       0.0      0.0     -0.1      0.0
SUM (ΣΔE3)          -3.8       0.0     -0.1     -2.2     -1.4

ΣΔE2 + ΣΔE3        -23.0     -81.0     86.9    -10.6    -18.3
total in hexamer   -23.4     -81.0     86.9    -10.9    -18.5
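The degree of nonadditivity in each component can be read off the summary rows of Table 6. The sketch below subtracts the two-body sums from the hexamer totals and then checks how much of the remainder the three-body sums recover; the variable names are illustrative, and because the tabulated entries are rounded to 0.1 kcal/mol the results can differ from the table's own combined row by a similar amount.

```python
# Summary rows of Table 6 (kcal/mol); column order follows the table.
labels = ("dE", "ES", "EX", "POL", "CT")
two_body_sum = (-19.2, -81.0, 87.0, -8.4, -16.8)
three_body_sum = (-3.8, 0.0, -0.1, -2.2, -1.4)
hexamer_total = (-23.4, -81.0, 86.9, -10.9, -18.5)

# Nonadditivity: whatever the pairwise sum fails to capture.
nonadditivity = {
    name: round(total - pair, 1)
    for name, total, pair in zip(labels, hexamer_total, two_body_sum)
}

# Remainder left after the three-body terms are included as well.
beyond_three_body = {
    name: round(total - pair - triad, 1)
    for name, total, pair, triad
    in zip(labels, hexamer_total, two_body_sum, three_body_sum)
}

print("nonadditivity:    ", nonadditivity)
print("beyond three-body:", beyond_three_body)
```

With the rounded entries, ES and EX are additive to within 0.1 kcal/mol, while POL and CT carry essentially all of the roughly 4 kcal/mol nonadditivity.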
The three-body terms are listed in the lower part of Table 6. The first several entries represent triplets of consecutive molecules around the ring; these interactions can be of either sign. Repulsive terms are associated with triplets like 3-4-5 and 4-5-6 that contain either a double-donor or double-acceptor; others are attractive. The ES contributions are identically zero and the exchange components are quite small. The three-body terms are composed of similar amounts of polarization and charge transfer components. It is worth noting that some of the three-body terms, e.g. 1-2-3, are of larger magnitude than certain pairwise interactions, particularly those between nonadjacent pairs.
Nonconsecutive triplets may contain no adjacent pairs, as in 2-4-6, or one adjacent pair, e.g. 1-2-4. In the former case, the three-body energies are less than 0.1 kcal/mol; the latter are all less than 0.2 kcal/mol. The total of all three-body interactions is -3.8 kcal/mol, as compared to -19.2 kcal/mol for the sum of all two-body interactions. When added together, the total of all pairwise and three-body interactions comes within 0.4 kcal/mol of the total interaction energy of -23.4 kcal/mol in the hexamer. With respect to the individual components, there is very little nonadditivity in ES or EX. The total nonadditivity of some 4 kcal/mol is approximately equally divided between POL and CT.
6. SUMMARY
It is remarkable that many of the ideas formulated over fifty years ago by Linus Pauling about hydrogen bonds remain valid today, after accumulation of reams of quantitative data from both experimental and theoretical perspectives. The correlation between the strength of the H-bond and the electronegativity of the atoms involved has been amply confirmed, with evidence suggesting a greater sensitivity to the nature of the donor than to the acceptor. Pauling's notion that H-bonding is largely an electrostatic phenomenon has also received affirmation; in fact, a purely Coulombic analysis can frequently predict with good accuracy the angular aspects of a given complex. Certain key geometric and spectroscopic characteristics of H-bonds, already considered by Pauling, bear strong relationships to the strength of the interaction, and can be used as indicators of the H-bond strength in the absence of energetic data. Since Pauling's early discussion of the issue of multiple H-bonds, much work has elucidated the quantitative aspects of the cooperativity and calculations have determined its underlying causes. The aforementioned relationships between bond strength and geometric and spectroscopic properties remain valid in chains of H-bonds and these quantities bear a direct relationship with the degree of cooperativity.
REFERENCES
1. L. Pauling, The nature of the chemical bond (Cornell University Press, Ithaca, NY, 1940).
2. T.S. Moore and T.F. Winmill, J. Chem. Soc., 101 (1912) 1635.
3. W.M. Latimer and W.H. Rodebush, J. Am. Chem. Soc., 42 (1920) 1419.
4. D.W. Michael, C.E. Dykstra and J.M. Lisy, J. Chem. Phys., 81 (1984) 5998.
5. M.J. Frisch, J.E. Del Bene, J.S. Binkley and H.F. Schaefer, J. Chem. Phys., 84 (1986) 2279.
6. J.E. Del Bene, J. Chem. Phys., 86 (1987) 2110.
7. K.A. Peterson and T.H.J. Dunning, J. Chem. Phys., 102 (1995) 2032.
8. M. Quack and M.A. Suhm, Theor. Chim. Acta, 93 (1996) 61.
9. C.L. Collins, K. Morihashi, Y. Yamaguchi and H.F. Schaefer, J. Chem. Phys., 103 (1995) 6051.
10. M.W. Feyereisen, D. Feller and D.A. Dixon, J. Phys. Chem., 100 (1996) 2993.
11. G. de Oliveira and C.E. Dykstra, J. Mol. Struct. (Theochem), 337 (1995) 1.
12. Y.-B. Wang, F.-M. Tao and Y.-K. Pan, J. Mol. Struct. (Theochem), 309 (1994) 235.
13. S. Saebø, W. Tong and P. Pulay, J. Chem. Phys., 98 (1993) 2170.
14. D.M. Hassett, C.J. Marsden and B.J. Smith, Chem. Phys. Lett., 183 (1991) 449.
15. F.-M. Tao and W. Klemperer, J. Chem. Phys., 99 (1993) 5976.
16. S.M. Cybulski, Chem. Phys. Lett., 228 (1994) 451.
17. F.-M. Tao and W. Klemperer, J. Chem. Phys., 103 (1995) 950.
18. A. Karpfen, P.R. Bunker and P. Jensen, Chem. Phys., 149 (1991) 299.
19. Z. Latajka and S. Scheiner, Chem. Phys., 122 (1988) 413.
20. D.D.J. Nelson, G.T. Fraser and W. Klemperer, J. Chem. Phys., 83 (1985) 6201.
21. M.J. Frisch, J.A. Pople and J.E. Del Bene, J. Phys. Chem., 89 (1985) 3664.
22. Z. Latajka and S. Scheiner, J. Chem. Phys., 84 (1986) 341.
23. J.W.I. van Bladel, A. van der Avoird, P.E.S. Wormer and R.J. Saykally, J. Chem. Phys., 97 (1992) 4750.
24. J.G. Loeser, C.A. Schmuttenmaer, R.C. Cohen, M.J. Elrod, D.W. Steyert, R.J. Saykally, R.E. Bumgarner and G.A. Blake, J. Chem. Phys., 97 (1992) 4727.
25. G. Alagona, C. Ghio, R. Cammi and J. Tomasi, Int. J. Quantum Chem., 32 (1987) 207.
26. R. Taylor, O. Kennard and W. Versichel, J. Am. Chem. Soc., 105 (1983) 5761.
27. R. Taylor and O. Kennard, Acc. Chem. Res., 17 (1984) 320.
28. P. Murray-Rust and J.P. Glusker, J. Am. Chem. Soc., 106 (1984) 1018.
29. A.C. Legon and D.J. Millen, Faraday Discuss. Chem. Soc., 73 (1982) 71.
30. A.C. Legon and D.J. Millen, Acc. Chem. Res., 20 (1987) 39.
31. A.C. Legon and D.J. Millen, Chem. Soc. Rev., 16 (1987) 467. 32. M.T. Carroll, C. Chang and M.F.W. Bader, Mol. Phys., 63 (1988) 387. 33. J.B.O. Mitchell and S.L. Price, Chem. Phys. Lett., 154 (1989) 267. 34. J.T. Brobjer and J.N. Murrell, J. Chem. Soc., Faraday Trans. 2, 79 (1983) 1455. 35. A.D. Buckingham and P.W. Fowler, J. Chem. Phys., 79 (1983) 6426. 36. A.D. Buckingham and P.W. Fowler, Can. J. Chem., 63 (1985) 2018. 37. A.P.L. Rendell, G.B. Bacskay and N.S. Hush, Chem. Phys. Lett., 117 (1985) 400. 38. V. Magnasco, C. Costa and G. Figari, J. Mol. Struct. (Theochem), 169 (1988) 105. 39. V. Magnasco, C. Costa and G. Figari, Chem. Phys. Lett., 160 (1989) 469.
591 40. S.M. Cybulski and S. Scheiner, J. Phys. Chem., 93 (1989) 6565. 41. G. Chalasinski, M.M. Szczesniak, P. Cieplak and S. Scheiner, J. Chem. Phys., 94 (1991) 2873. 42. K. Morokuma and K. Kitaura, in: Chemical Applications of Atomic and Molecular Electrostatic Potentials, ed. P. Politzer and D.G. Truhlar (Plenum, New York, 1981) p. 215. 43. K. Morokuma and K. Kitaura, in: Molecular Interactions, ed. H. Ratajczak and W.J. Orville-Thomas Vol. 1 (Wiley, New York, 1980) p. 21. 44. R.F. Frey and E.R. Davidson, J. Chem. Phys., 90 (1989) 5555. 45. S.M. Cybulski and S. Scheiner, Chem. Phys. Lett., 166 (1990) 57. 46. A.E. Reed and F. Weinhold, J. Chem. Phys., 78 (1983) 4066. 47. S.J. Harris, K.C. Janda, S.E. Novick and W. Klemperer, J. Chem. Phys., 63 (1975) 881. 48. F.A. Baiocchi and W. Klemperer, J. Chem. Phys., 78 (1983) 3509. 49. W.J. Stevens and W.H. Fink, Chem. Phys. Lett., 139 (1987) 15. 50. R.M. Badger and S.H. Bauer, J. Chem. Phys., 5 (1939) 839. 51. G.C. Pimentel and A.L. McClellan, The Hydrogen Bond (Freeman, San Francisco, 1960). 52. C. Laurence, M. Berthelot, M. Helbert and K. Srafdi, J. Phys. Chem., 93 (1989) 3799. 53. E.E. Tucker and E. Lippert, in: The Hydrogen Bond. Recent Developments in Theory and Experiments, ed. P. Schuster, G. Zundel, andC. Sandorfy Vol. 2 (North-Holland Publishing Co., Amsterdam, 1976) p. 791. 54. H. Koller, R.F. Lobo, S.L. Burkett and M.E. Davis, J. Phys. Chem., 99 (1995) 55. H. Eckert, J.P. Yesinowski, L.A. Silver and E.M. Stolper, J. Phys. Chem., 92 (1988) 2055. 56. R. Kaliaperumal, R.E.J. Sears, Q.W. Ni and J.E. Furst, J. Chem. Phys., 91 (1989) 7387. 57. Z. Gu, C.F. Ridenour, C.E. Bronnimann, T. Iwashita and A. McDermott, J. Am. Chem. Soc., 118 (1996) 822. 58. I. Olovsson and P.-G. JSnsson, in: The Hydrogen Bond. Recent Developments in Theory and Experiments, ed. P. Schuster, G. Zundel, andC. Sandorfy Vol. 2 (North-Holland Publishing Co., Amsterdam, 1976) p. 393. 59. X. Duan and S. Scheiner, Int. J. 
Quantum Chem., QBS, 20 (1993) 181. 60. J.E. Del Bene, W.B. Person and K. Szczepardak, Mol. Phys., 89 (1996) 47. 61. Z. Latajka and S. Scheiner, Chem. Phys. Lett., 174 (1990) 179. 62. R. Thijs and T. Zeegers-Huyskens, Spectrochim. Acta A, 40 (1984) 307. 63. C. Ceccarelli, G.A. Jeffrey and R. Taylor, J. Mol. Struct., 70 (1981) 255. 64. G. Chalasinski and M.M. Szczesniak, Chem. Rev., 94 (1994) 1723. 65. M. Kofranek, H. Lischka and A. Karpfen, Chem. Phys., 113 (1987) 53. 66. J.C. White and E.R. Davidson, J. Chem. Phys., 93 (1990) 8029.
This Page Intentionally Left Blank
Z.B. Maksić and W.J. Orville-Thomas (Editors)
Pauling's Legacy: Modern Modelling of the Chemical Bond
Theoretical and Computational Chemistry, Vol. 6
© 1999 Elsevier Science B.V. All rights reserved.
Molecular Similarity and Host-Guest Interactions
Paul G. Mezey
Mathematical Chemistry Research Unit, Department of Chemistry and Department of Mathematics and Statistics, University of Saskatchewan, 110 Science Place, Saskatoon, SK, Canada, S7N 5C9
ABSTRACT
Quantum chemical treatment of host-guest interactions can be approached from a new perspective provided by the extension of electron density shape analysis methods to large systems. Host-guest interactions are manifested in the changes of electron densities detectable in the composite host-guest system, as compared to the electron densities of the individual, non-interacting host and guest molecules. The similarities and dissimilarities between the electron densities of interacting and non-interacting molecules provide quantum chemical descriptors of host-guest interactions. Some of the computational techniques relevant to such analyses are reviewed.
INTRODUCTION
Host-guest interactions are fundamental in enzyme actions and in many other biochemical processes [1-10]. In most instances, host-guest interactions involve special aspects of molecular similarity [11-26] and complementarity: interfacing between intermediate-size molecular regions. The range of features where a formal match between the shape properties and electronic properties is required [27] falls between the truly local interactions between individual atom pairs or small molecular fragments and the global interactions of extensive regions of molecules. This intermediate aspect of the range of interactions in host-guest problems provides both a challenge and a fertile ground for novel ideas of representing molecules and their electron density clouds. The approach followed in this contribution is motivated by a rather trivial observation: molecular properties are determined by the electron density distribution [27]. With the introduction of techniques suitable for the generation of ab initio quality electron densities for large molecules such as proteins, the range of electron density analysis has been extended from small systems to molecules of virtually any size. Host-guest interactions often involve a large host molecule and a usually smaller guest molecule, hence the role of large molecule electron density analysis is of special importance in this field. Of course, in most instances, only a small part of the large host molecule participates directly in the host-guest interactions, yet the entire host is often required in order to provide the proper framework and geometrical constraints for the interactions.
The first computations of ab initio quality electron densities of large molecules such as proteins have been based on the Additive Fuzzy Density Fragmentation (AFDF) principle [28-32], a technique that has been reviewed recently [31,32]. The first AFDF approach was the MEDLA method (Molecular Electron Density "Loge" Assembler, or Molecular Electron Density "Lego" Assembler method) of Walker and Mezey [33-38], where a pre-calculated, numerical electron density fragment database is used. The entries in this database are custom-made electron density fragments, obtained from ab initio computations for small molecules which contain the molecular fragment within a local environment reproducing its actual environment within the macromolecule. A more advanced AFDF approach, the ADMA method (Adjustable Density Matrix Assembler method) [39-41], is based on a density matrix database and the actual construction of a macromolecular density matrix. This technique, also reviewed in part in ref. [31], is suitable for the rapid computation of various additional molecular properties besides electron densities.
According to detailed test calculations [33,34,37], the AFDF method produces nearly 6-31G** ab initio quality electron densities for large molecules, and is superior to conventional ab initio computations using the smaller standard basis sets. The tests included detailed comparisons of electron densities obtained for the amino acid β-alanine [33], the model peptide system of glycyl-alanine [34], the reproducibility of the electron density of a H-bond in a helical tetrapeptide [34], the reproducibility of a non-bonded interaction between a sulfur atom and a phenyl ring in a molecular fragment from the pentapeptide metenkephalin [34], and the reproducibility of aromatic rings and substituent effects in a series of aromatic molecules [37]. The numerical tests relied on direct, point-by-point comparisons of three-dimensional density grids, and also integrated similarity measures [37] including the Carbó quantum similarity index.
FROM FUNCTIONAL GROUPS TO EXTENDED MOLECULAR REGIONS
Two important tools of electron density modeling and shape analysis are the concepts of the molecular isodensity contour, MIDCO $G(K,a)$, and the associated density domain $DD(K,a)$ [27]. Here the nuclear arrangement (also called the nuclear configuration) is denoted by $K$, whereas the electron density threshold is denoted by $a$. Each MIDCO $G(K,a)$ is the collection of all those points $\mathbf{r}$ of the three-dimensional space where the electron density $\rho(K,\mathbf{r})$ of the molecule $M$ of conformation $K$ is equal to the threshold $a$. The density domain $DD(K,a)$ is the collection of all points $\mathbf{r}$ where the electron density $\rho(K,\mathbf{r})$ is greater than or equal to the threshold $a$. Using formal notations,

$$G(K,a) = \{\, \mathbf{r} : \rho(K,\mathbf{r}) = a \,\}, \tag{1}$$

and

$$DD(K,a) = \{\, \mathbf{r} : \rho(K,\mathbf{r}) \geq a \,\}. \tag{2}$$

The MIDCO $G(K,a)$ can be regarded as the boundary surface of the density domain $DD(K,a)$.
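As a rough illustration of definitions (1) and (2), the following sketch tests membership in a density domain and locates one MIDCO point by bisection. The toy density (a sum of two unit Gaussians), the nuclear positions, and all function names are assumptions made for this example only; they are not part of the method described here.

```python
import math

# Hypothetical toy model: one spherical Gaussian per nucleus (position, weight).
NUCLEI = [((0.0, 0.0, 0.0), 1.0), ((1.5, 0.0, 0.0), 1.0)]

def rho(r):
    """Toy electron density at point r (sum of spherical Gaussians)."""
    return sum(w * math.exp(-math.dist(r, c) ** 2) for c, w in NUCLEI)

def in_density_domain(r, a):
    """Eq. (2): r belongs to DD(K,a) when rho(K,r) >= a."""
    return rho(r) >= a

def on_midco(r, a, tol=1e-6):
    """Eq. (1), up to a numerical tolerance: rho(K,r) equals a."""
    return abs(rho(r) - a) <= tol

def midco_crossing(inside, outside, a, steps=60):
    """Bisect the segment inside -> outside for a point with rho == a.

    Assumes rho(inside) >= a > rho(outside).
    """
    for _ in range(steps):
        mid = tuple((p + q) / 2 for p, q in zip(inside, outside))
        if rho(mid) >= a:
            inside = mid
        else:
            outside = mid
    return inside

point = midco_crossing((1.5, 0.0, 0.0), (6.0, 0.0, 0.0), a=0.1)
```

The returned point lies on the boundary of the density domain for the threshold 0.1, in line with the remark that the MIDCO is the boundary surface of $DD(K,a)$.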
Quantum chemical functional groups have been defined as fuzzy electron density fragments (AFDF fragments) associated with a family of nuclei $f_k$, where there exists some density threshold $a$ such that this family $f_k$ is separated from the rest of the nuclei of the molecule by the corresponding MIDCO $G(K,a)$. This definition reflects the limited autonomy and separate identity of functional groups within molecules, just as the existence of MIDCOs separating the nuclei of two different molecules placed within a short distance of one another reflects the autonomy and separate identity of the two molecules.
Consider a macromolecule $M$ of some nuclear configuration $K$. For an electron density threshold $a$, the family of functional groups

$$F_1, F_2, \ldots, F_m \tag{3}$$

containing the nuclear families

$$f_1, f_2, \ldots, f_m \tag{4}$$

is characterized by the corresponding density domains

$$DD_1(K,a), DD_2(K,a), \ldots, DD_m(K,a) \tag{5}$$

which, by definition, must appear as separate entities. Using the additive, fuzzy density fragmentation (AFDF) approach of the macromolecular density $\rho_M(\mathbf{r})$, the fuzzy fragment electron density contributions

$$\rho_{F_1}(\mathbf{r}), \rho_{F_2}(\mathbf{r}), \ldots, \rho_{F_i}(\mathbf{r}), \ldots, \rho_{F_m}(\mathbf{r}) \tag{6}$$

can be regarded as the representations of the "share" of each functional group $F_i$ within the total macromolecular electron density $\rho_M(\mathbf{r})$. If the nuclear sets $f_1, f_2, \ldots, f_m$ contain all the nuclei of the molecule, then one may reconstruct the electron density $\rho_M(\mathbf{r})$ of macromolecule $M$ by a simple superimposition of the fuzzy fragment densities $\rho_{F_1}(\mathbf{r}), \rho_{F_2}(\mathbf{r}), \ldots, \rho_{F_i}(\mathbf{r}), \ldots, \rho_{F_m}(\mathbf{r})$ of the actual family of functional groups:

$$\rho_M(\mathbf{r}) = \sum_i \rho_{F_i}(\mathbf{r}). \tag{7}$$
This approach can be easily generalized for much larger molecular moieties, for example, to entire reactive regions of a macromolecule, representing the "host" region of the molecule, or to the reactive region of the "guest" molecule. Furthermore, the interacting host-guest structure itself can be regarded as a formal "region" of a supermolecule, and the electron density shape analysis can be carried out accordingly.
If one drops the condition that there must exist a density domain that contains all the nuclei involved in the molecular moiety considered, then the treatment can be made very general, and applied to virtually any molecular moiety. The given macromolecule $M$ of the specified nuclear configuration $K$ can be regarded as a composite of several regions, where each individual region might be much more extensive than individual functional groups, and where the requirement for "autonomous" density domains for each contributing molecular region no longer applies. For a specified electron density threshold $a$, the regions

$$R_1, R_2, \ldots, R_r \tag{8}$$

contain the nuclear families

$$f_1, f_2, \ldots, f_r, \tag{9}$$

respectively. The additive, fuzzy density fragmentation (AFDF) method applies to the fuzzy regional electron density contributions

$$\rho_{R_1}(\mathbf{r}), \rho_{R_2}(\mathbf{r}), \ldots, \rho_{R_i}(\mathbf{r}), \ldots, \rho_{R_r}(\mathbf{r}), \tag{10}$$

which collectively reproduce the macromolecular electron density $\rho_M(\mathbf{r})$. The individual regional densities $\rho_{R_i}(\mathbf{r})$ within the total macromolecular electron density $\rho_M(\mathbf{r})$ can be regarded as the quantum chemical representations of the electron density "share" of each region $R_i$.
Following a treatment similar to that applied to functional groups, we assume that the nuclear sets $f_1, f_2, \ldots, f_r$ contain all the nuclei of the macromolecule $M$. We may reconstruct the electron density $\rho_M(\mathbf{r})$ of the macromolecule by superimposing the fuzzy regional densities $\rho_{R_1}(\mathbf{r}), \rho_{R_2}(\mathbf{r}), \ldots, \rho_{R_i}(\mathbf{r}), \ldots, \rho_{R_r}(\mathbf{r})$:

$$\rho_M(\mathbf{r}) = \sum_i \rho_{R_i}(\mathbf{r}). \tag{11}$$
The shape analysis and shape comparisons of electron densities of molecular regions provide information relevant to their interactions. In the next section a brief review of a shape analysis method is given.
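The additivity expressed by eqs. (7) and (11) can be illustrated with a small numerical sketch. The partitioning rule used below is an assumption that is not spelled out in this chapter: it follows the Mulliken-type AFDF prescription described in the reviews cited above [31,32], in which a density matrix element is assigned fully to a region if both basis functions are centred on nuclei of that region, by one half if only one of them is, and not at all otherwise. The matrix values, the region labels, and the function name are hypothetical.

```python
def fragment_density_matrix(P, owner, region):
    """Share of the density matrix P assigned to `region`.

    owner[i] is the region label of the nucleus carrying basis function i.
    Assumed rule: P_k[i][j] = 0.5 * (m_k(i) + m_k(j)) * P[i][j],
    where m_k(i) is 1 if basis function i belongs to region k, else 0.
    """
    n = len(P)
    return [[0.5 * ((owner[i] == region) + (owner[j] == region)) * P[i][j]
             for j in range(n)] for i in range(n)]

# Hypothetical 3x3 density matrix and region assignment.
P = [[1.0, 0.3, 0.1],
     [0.3, 0.8, 0.2],
     [0.1, 0.2, 0.6]]
owner = ["host", "host", "guest"]

parts = {k: fragment_density_matrix(P, owner, k) for k in set(owner)}

# Element-wise sum of the regional matrices recovers P (cf. eq. 11).
total = [[sum(parts[k][i][j] for k in parts) for j in range(3)]
         for i in range(3)]
```

Because the electron density is linear in the density matrix elements, additivity of the regional matrices carries over to the regional densities themselves.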
ELEMENTS OF ELECTRON DENSITY SHAPE ANALYSIS
The complete, three-dimensional shape of fuzzy electron densities can be described in detail using the Shape Group Methods (SGM), reviewed in detail in a recent monograph [27]. Here, only a brief summary of this method will be given.
The application of SGM to molecular electron densities is based on the comparison of local curvatures of a range of MIDCOs to a range of reference curvatures, involving several steps. First, two ranges are selected: a range of electron density thresholds $a$ and a range of reference curvatures $b$. For each pair of values $a$ and $b$ within these ranges, each MIDCO $G(K,a)$ is partitioned into local curvature domains relative to each value $b$, specifying whether the MIDCO $G(K,a)$ is convex, concave, or of the saddle type relative to this curvature $b$. In practice, the local curvature of a MIDCO surface $G(K,a)$ at each point $\mathbf{r}$ is characterized by a local curvature matrix called the local Hessian matrix. Depending on the local relative convexity, the points are classified into curvature domains of the types $D_0(b)$, $D_1(b)$, or $D_2(b)$, respectively. This is accomplished by comparing the local canonical curvatures (the eigenvalues of the local Hessian matrices) at each surface point $\mathbf{r}$ to the reference curvature $b$; a point $\mathbf{r}$ of $G(K,a)$ is assigned to a $D_0(b)$, $D_1(b)$, or $D_2(b)$ curvature domain if none, one, or two (respectively) of the eigenvalues of the local Hessian matrix of the surface at point $\mathbf{r}$ are smaller than the curvature parameter $b$. The computational problem is simplified by the fact that for the identification of all the topologically different patterns of curvature domains only a finite number of $(a,b)$ pairs need to be considered.
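The classification rule just described can be sketched in a few lines. The sketch assumes that the local Hessian at a surface point is available as a symmetric 2x2 matrix (the surface being two-dimensional); how this matrix is obtained from the density is outside the scope of the example, and the function names are hypothetical.

```python
import math

def canonical_curvatures(h11, h12, h22):
    """Eigenvalues of the symmetric 2x2 local Hessian [[h11, h12], [h12, h22]]."""
    mean = 0.5 * (h11 + h22)
    radius = math.hypot(0.5 * (h11 - h22), h12)
    return mean - radius, mean + radius

def curvature_domain_type(h11, h12, h22, b):
    """Return mu in {0, 1, 2}: the number of eigenvalues smaller than b.

    A point with result mu is assigned to a D_mu(b) curvature domain.
    """
    return sum(1 for eig in canonical_curvatures(h11, h12, h22) if eig < b)
```

For example, with the reference curvature $b = 0$ a point whose local Hessian is diag(-1, -1) falls into a $D_2(0)$ domain, whereas diag(1, -1) gives $D_1(0)$ and diag(1, 1) gives $D_0(0)$. Changing $b$ can move the same point to a different domain type, which is why a whole range of reference curvatures is scanned.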
The next step involves a truncation of MIDCO surfaces according to the various curvature domains identified on them. For each $(a,b)$ pair of values, all curvature domains $D_\mu(b)$ of a specified type $\mu$ (usually, the type $\mu = 2$) are removed from the MIDCO $G(K,a)$, and a truncated surface $G(K,a,\mu)$ is obtained. Note that for the whole range of parameter values $a$ and $b$ of each molecule $M$, only a finite number of topologically different truncated surfaces are obtained.
In the final step, the shape groups of the entire molecular electron density distribution are computed. By definition, the shape groups are the algebraic homology groups of the truncated surfaces, which are invariants within each topological equivalence class of these surfaces. The ranks of these homology groups are the Betti numbers, serving as a set of numerical shape descriptors for the entire range of MIDCOs $G(K,a)$ of the molecule $M$. For electron density analysis in three dimensions, using two-dimensional MIDCO surfaces, there are three types of shape groups, one for each of the dimensions zero, one, and two. The associated Betti numbers are $b^0_\mu(a,b)$, $b^1_\mu(a,b)$, and $b^2_\mu(a,b)$, where the truncation type $\mu$ is also specified.
The results of the shape group analysis can be summarized using the Betti numbers. The distribution of various values of Betti numbers $b^p_\mu(a,b)$ as a function of the density threshold $a$ and curvature parameter $b$ is represented by various $(a,b)$-maps. Discretized versions of $(a,b)$-maps, in the form of shape matrices $\mathbb{M}^{(a,b)}$, serve as numerical shape codes for the molecules. The total number of elements in the shape code matrix $\mathbb{M}^{(a,b)}$ is

$$t = n_a n_b, \tag{12}$$

where in this discretized version of the $(a,b)$-map, $n_a$ and $n_b$ are the number of grid points for parameters $a$ and $b$, respectively. Shape similarity between two molecules or molecular regions $A$ and $B$ can be expressed using the following shape-similarity measure:

$$s(A,B) = m[\mathbb{M}^{(a,b),A}, \mathbb{M}^{(a,b),B}] \,/\, t. \tag{13}$$

Here $m[\mathbb{M}^{(a,b),A}, \mathbb{M}^{(a,b),B}]$ is the number of matches between corresponding elements in the two shape code matrices $\mathbb{M}^{(a,b),A}$ and $\mathbb{M}^{(a,b),B}$ of the two molecules or molecular fragments $A$ and $B$, respectively.
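The similarity measure of eq. (13) amounts to counting matching entries. A minimal sketch follows, assuming that each shape code matrix is stored as a list of rows over the same $(a,b)$ grid and that each entry is some encoded Betti number pattern; the integer entries shown are hypothetical placeholders, not computed shape codes.

```python
def shape_similarity(code_a, code_b):
    """Eq. (13): fraction of matching elements in two shape code matrices."""
    if len(code_a) != len(code_b) or any(
        len(ra) != len(rb) for ra, rb in zip(code_a, code_b)
    ):
        raise ValueError("shape code matrices must share the same (a,b) grid")
    t = sum(len(row) for row in code_a)  # t = n_a * n_b, eq. (12)
    matches = sum(x == y
                  for ra, rb in zip(code_a, code_b)
                  for x, y in zip(ra, rb))
    return matches / t

# Hypothetical shape codes on a grid with n_a = 2 and n_b = 3.
code_A = [[1, 2, 2], [0, 1, 3]]
code_B = [[1, 2, 0], [0, 1, 1]]
similarity = shape_similarity(code_A, code_B)  # 4 matches out of t = 6
```

The measure takes the value 1 only when the two shape codes agree at every grid point, and it decreases in steps of $1/t$ with each mismatch.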
ELECTRON DENSITY ANALYSIS OF ISOLATED AND INTERACTING REACTIVE REGIONS OF MOLECULES
The approach we shall follow in the study of host-guest interactions involves the comparison of electron density distributions of molecular regions in the presence and also in the absence of interactions. In this context, some aspects of molecular similarity analysis are applied, evaluating the similarities between two electron densities: one where these regions are taken in isolation, and another that includes the interactions between the host and guest regions of the molecules involved.
The first model, describing the isolated region approach, can be derived easily from the AFDF principle. Molecular regions are described by fuzzy electron densities analogous to densities of complete molecules, and the local shape analysis of regions follows the same principles as the shape analysis of complete molecules. In order to emphasize the fact that molecular regions are involved, we shall use the terminology "region isodensity contour" (RIDCO) surface, replacing the term "molecular isodensity contour" (MIDCO) surface. The notation $R$ is used for the actual region selected for study and $M'$ denotes the rest of the macromolecule $M$. The rest of the molecule, $M'$, may be composed of several regions, $R_1, R_2, \ldots, R_{r-1}$, and without restriction on generality, the actual region $R$, the subject of our study, is assumed to correspond to the last region in the series, $R = R_r$.
If the influence of the rest of the molecule on a molecular region $R$ is unimportant, or if one is interested in the electron density of a given region in the absence of interactions with other parts of the molecule, then the actual molecular density region $R$ can be regarded as a separate entity. In this case, it is meaningful to consider RIDCO contours for $R$ where the density threshold $a$ is compared only to the actual regional density $\rho_R(\mathbf{r})$. In this model, the RIDCOs are not influenced by the additional density contributions from the rest of the molecule $M$.
In this model, a "non-interacting" RIDCO for a region $R$ of a molecule $M = RM'$ is defined as follows:

$$G_{R\setminus M'}(a) = \{\, \mathbf{r} : \rho_R(\mathbf{r}) = a,\ \rho_R(\mathbf{r}) \geq \rho_{R_k}(\mathbf{r}),\ k = 1, \ldots, r-1 \,\}. \tag{14}$$

Alternatively, one may define a "non-interacting" RIDCO using the following relations:

$$G_{R\setminus M'}(a) = G_R(a) \cap \{\, \mathbf{r} : \rho_R(\mathbf{r}) \geq \rho_{R_k}(\mathbf{r}),\ k = 1, \ldots, r-1 \,\}, \tag{15}$$

or

$$G_{R\setminus M'}(a) = G_R(a) \setminus \{\, \mathbf{r} : \exists\, k \in \{1, \ldots, r-1\} : \rho_R(\mathbf{r}) < \rho_{R_k}(\mathbf{r}) \,\}. \tag{16}$$

These two alternative definitions, (15) and (16), are equivalent to the definition given by eq. (14). These definitions, as well as some of the properties of "non-interacting" RIDCOs, are analogous to those of "non-interacting" functional groups, discussed in ref. [31].
The interpretation of RIDCO $G_{R\setminus M'}(a)$ is simple: for a region $R$ in macromolecule $M = RM'$, $G_{R\setminus M'}(a)$ is the set of all those points $\mathbf{r}$ of the contour $G_R(a)$ where the electron density contribution $\rho_R(\mathbf{r})$ of region $R$ is dominant among all regional electron densities $\rho_{R_k}(\mathbf{r})$ within the macromolecular electron density $\rho_M(\mathbf{r})$.
The standard Shape Group Method is applicable for the analysis of the entire series of non-interacting RIDCOs, for a whole range of density thresholds $a$, with the provision of an additional domain type representing the connection of region $R$ to the rest of the molecule within the actual $RM'$ system. This additional domain type $D_{-1}$ is defined as

$$D_{-1}(G_{R\setminus M'}(a)) = \{\, \mathbf{r} : \mathbf{r} \in G_R(a),\ \exists\, k \in \{1, \ldots, r-1\} : \rho_R(\mathbf{r}) < \rho_{R_k}(\mathbf{r}) \,\}. \tag{17}$$

On the actual, non-interacting RIDCO surface $G_{R\setminus M'}(a)$ only the boundary $\Delta D_{-1}(G_{R\setminus M'}(a))$ of this additional domain can be found:

$$\Delta D_{-1}(G_{R\setminus M'}(a)) = \{\, \mathbf{r} : \mathbf{r} \in G_{R\setminus M'}(a),\ \exists\, k' \in \{1, \ldots, r-1\} : \rho_R(\mathbf{r}) = \rho_{R_{k'}}(\mathbf{r}),\ \rho_R(\mathbf{r}) \geq \rho_{R_k}(\mathbf{r}),\ k = 1, \ldots, r-1 \,\}. \tag{18}$$
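The equivalence of definitions (14) and (16), and the role of the additional domain $D_{-1}$ of eq. (17), can be checked point by point in a small sketch. The regional densities below are hypothetical single Gaussians chosen only for illustration, the contour condition is tested up to a numerical tolerance, and all names are assumptions of this example.

```python
import math

def gaussian(center, weight=1.0):
    """Hypothetical fuzzy regional density: a single spherical Gaussian."""
    return lambda r: weight * math.exp(-math.dist(r, center) ** 2)

rho_R = gaussian((0.0, 0.0, 0.0))              # region R under study
rho_rest = [gaussian((2.0, 0.0, 0.0)),         # remaining region R_1
            gaussian((0.0, 3.0, 0.0), 0.5)]    # remaining region R_2

def on_G_R(r, a, tol=1e-9):
    """Isolated RIDCO G_R(a): rho_R(r) equals a within a tolerance."""
    return abs(rho_R(r) - a) <= tol

def in_D_minus1(r, a):
    """Eq. (17): point of G_R(a) where some other region dominates."""
    return on_G_R(r, a) and any(rho_R(r) < f(r) for f in rho_rest)

def in_ridco_eq14(r, a):
    """Eq. (14): on G_R(a) and rho_R dominates every other regional density."""
    return on_G_R(r, a) and all(rho_R(r) >= f(r) for f in rho_rest)

def in_ridco_eq16(r, a):
    """Eq. (16): G_R(a) with the D_-1 points removed."""
    return on_G_R(r, a) and not in_D_minus1(r, a)

a = 0.3
radius = math.sqrt(-math.log(a))               # rho_R equals a on this sphere
samples = [(-radius, 0.0, 0.0), (radius, 0.0, 0.0), (0.0, radius, 0.0)]
membership = [(in_ridco_eq14(p, a), in_ridco_eq16(p, a)) for p in samples]
```

In this toy arrangement the sample point facing region $R_1$ is removed from the non-interacting RIDCO because $R_1$ dominates there, while the other two sample points are retained; both definitions give the same answer at every sample.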
The domain $D_{-1}(G_{R\setminus M'}(a))$ itself exists only on the intact $G_R(a)$ contour surface. For a macromolecule $RM'$, a typical domain $D_{-1}(G_{R\setminus M'}(a))$ appears only as a formal cover over a hole of the non-interacting RIDCO $G_{R\setminus M'}(a)$.
Following the treatment of non-interacting functional groups [31], a simpler representation of region $R$ in molecule $RM'$ is obtained if the regional density $\rho_R(\mathbf{r})$ is compared not with each remaining region separately, but with the composite $M'$ of all the remaining regions $R_1, R_2, \ldots, R_{r-1}$. Using the notation $\rho_{M'}(\mathbf{r})$ for the composite density of all the remaining regions,

$$\rho_{M'}(\mathbf{r}) = \rho_{R_1}(\mathbf{r}) + \rho_{R_2}(\mathbf{r}) + \ldots + \rho_{R_{r-1}}(\mathbf{r}), \tag{19}$$

the corresponding non-interacting RIDCO surfaces $G_{R\setminus\Sigma M'}(a)$ are defined as

$$G_{R\setminus\Sigma M'}(a) = \{\, \mathbf{r} : \rho_R(\mathbf{r}) = a,\ \rho_R(\mathbf{r}) \geq \rho_{M'}(\mathbf{r}) \,\}. \tag{20}$$

Using this approach, new local domain types appear at those locations of the molecular electron density where the region $R$ connects to the rest $M'$ of the molecule $RM' = M$:

$$D_{-1}(G_{R\setminus\Sigma M'}(a)) = \{\, \mathbf{r} : \mathbf{r} \in G_R(a),\ \rho_R(\mathbf{r}) < \rho_{M'}(\mathbf{r}) \,\}. \tag{21}$$

For the purposes of shape analysis, the boundaries of these additional domains are of importance. These boundaries are

$$\Delta D_{-1}(G_{R\setminus\Sigma M'}(a)) = \{\, \mathbf{r} : \mathbf{r} \in G_{R\setminus\Sigma M'}(a),\ \rho_R(\mathbf{r}) = \rho_{M'}(\mathbf{r}) \,\}. \tag{22}$$
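The boundary points of eq. (22) are those where the regional density and the composite density of the rest are equal. One simple way to locate such a point, offered here only as an illustrative assumption and not as the procedure used in the cited work, is bisection along a line segment that starts where $\rho_R$ dominates and ends where $\rho_{M'}$ dominates. The Gaussian densities and all names are hypothetical.

```python
import math

def gaussian(center, weight=1.0):
    """Hypothetical fuzzy density contribution: a single spherical Gaussian."""
    return lambda r: weight * math.exp(-math.dist(r, center) ** 2)

rho_R = gaussian((0.0, 0.0, 0.0))
rest = [gaussian((2.0, 0.0, 0.0)), gaussian((4.0, 0.0, 0.0))]

def rho_M_prime(r):
    """Eq. (19): composite density of all remaining regions."""
    return sum(f(r) for f in rest)

def balance_point(p, q, steps=80):
    """Bisect the segment p -> q for the point where rho_R equals rho_M'.

    Assumes rho_R > rho_M' at p and rho_R < rho_M' at q.
    """
    for _ in range(steps):
        mid = tuple((u + v) / 2 for u, v in zip(p, q))
        if rho_R(mid) >= rho_M_prime(mid):
            p = mid
        else:
            q = mid
    return tuple((u + v) / 2 for u, v in zip(p, q))

x = balance_point((0.0, 0.0, 0.0), (2.0, 0.0, 0.0))
threshold = rho_R(x)
```

The point found in this way belongs to the boundary (22) of the non-interacting RIDCO whose threshold is $a = \rho_R(\mathbf{x})$, stored above as `threshold`.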
By determining all points $\mathbf{r}$ where $\rho_R(\mathbf{r}) = \rho_{M'}(\mathbf{r})$, these boundaries can be found with relatively little computational effort.
In some instances, the interactions of various molecular regions in a macromolecule $RM'$ or in a composite "supramolecule" of a host-guest system are of interest. In these cases, a local shape analysis of the "isolated" RIDCO surfaces $G_R(a)$ is no longer sufficient, and the study of the interactions requires the calculation of new density contours. A suitable definition for "interactive" RIDCO surfaces can be given as follows:

$$G_{R(M')}(a) = \{\, \mathbf{r} : \rho_R(\mathbf{r}) + \rho_{M'}(\mathbf{r}) = a,\ \rho_R(\mathbf{r}) \geq \rho_{M'}(\mathbf{r}) \,\}. \tag{23}$$

On these $G_{R(M')}(a)$ surfaces there are no domains which would correspond to some formal covers of holes in the RIDCO $G_{R(M')}(a)$; however, for consistency with the notations used in the case of non-interacting RIDCOs, the formal boundaries of the holes on $G_{R(M')}(a)$ are denoted by $\Delta D_{-1}(G_{R(M')}(a))$:

$$\Delta D_{-1}(G_{R(M')}(a)) = \{\, \mathbf{r} : \mathbf{r} \in G_{R(M')}(a),\ \rho_R(\mathbf{r}) = \rho_{M'}(\mathbf{r}) \,\}. \tag{24}$$

The computation and shape analysis of the interactive RIDCOs of a region $R$ in a macromolecule $RM'$ requires the determination of additional contours. This implies that the shape analysis of interacting RIDCOs is computationally more expensive than that of the non-interacting RIDCOs $G_{R\setminus M'}(a)$. For the suggested approach to the study of host-guest interactions, the shape analysis of both types of RIDCOs is required.
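The difference between the isolated contour $G_R(a)$ and the interactive RIDCO of eq. (23) can be seen in a one-directional sketch: along a ray from the centre of region $R$, the point where the total density reaches the threshold lies slightly farther out than the point where $\rho_R$ alone does. The Gaussian densities, the bisection helper, and all names are assumptions introduced for this example.

```python
import math

def gaussian(center, weight=1.0):
    """Hypothetical fuzzy density contribution: a single spherical Gaussian."""
    return lambda r: weight * math.exp(-math.dist(r, center) ** 2)

rho_R = gaussian((0.0, 0.0, 0.0))
rho_M_prime = gaussian((2.5, 0.0, 0.0))   # hypothetical composite rest density

def total(r):
    """Density entering eq. (23): rho_R + rho_M'."""
    return rho_R(r) + rho_M_prime(r)

def contour_point(density, direction, a, r_max=10.0, steps=80):
    """Bisect along `direction` from the origin for a point with density == a.

    Assumes density >= a at the origin and density < a at distance r_max.
    """
    lo, hi = 0.0, r_max
    for _ in range(steps):
        mid = (lo + hi) / 2
        if density(tuple(mid * d for d in direction)) >= a:
            lo = mid
        else:
            hi = mid
    return tuple(lo * d for d in direction)

def on_interactive_ridco(r, a, tol=1e-9):
    """Eq. (23): total density equals a and rho_R is the dominant share."""
    return abs(total(r) - a) <= tol and rho_R(r) >= rho_M_prime(r)

a = 0.3
away = (-1.0, 0.0, 0.0)      # direction pointing away from M'
toward = (1.0, 0.0, 0.0)     # direction pointing toward M'
p_isolated = contour_point(rho_R, away, a)
p_interactive = contour_point(total, away, a)
p_far_side = contour_point(total, toward, a)
```

In this toy system `p_interactive` satisfies eq. (23), whereas `p_far_side` lies on the same total-density contour but in the part dominated by $M'$, so it is excluded from $G_{R(M')}(a)$.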
SHAPE SIMILARITY MEASURES AND DISSIMILARITY MEASURES IN THE STUDY OF HOST-GUEST INTERACTIONS
Consider the interacting host-guest complex as a single supermolecule $M$, and denote the region of interaction in this complex by $R_{hg}$. The corresponding reactive regions of the isolated host and the isolated guest molecules are denoted by $R_h$ and $R_g$, respectively. The regional, fuzzy electron densities associated with the three regions, $R_{hg}$, $R_h$, and $R_g$, are denoted by $\rho_{R_{hg}}(\mathbf{r})$, $\rho_{R_h}(\mathbf{r})$, and $\rho_{R_g}(\mathbf{r})$, respectively. From the latter two regional densities $\rho_{R_h}(\mathbf{r})$ and $\rho_{R_g}(\mathbf{r})$, one may construct the composite density $\rho_{R_{h+g}}(\mathbf{r})$ of a formal, superimposed but non-interacting host-guest region $R_{h+g}$,

$$\rho_{R_{h+g}}(\mathbf{r}) = \rho_{R_h}(\mathbf{r}) + \rho_{R_g}(\mathbf{r}). \tag{25}$$
The Shape Group analysis can be carried out for all these regional electron densities using both the non-interacting and the interacting RIDCO formalisms. We shall use the following notations: subscripts and superscripts ni1, ni2, and i correspond to the choices of non-interacting RIDCOs $G_{R\setminus M'}(a)$, non-interacting RIDCOs $G_{R\setminus\Sigma M'}(a)$, and the interacting RIDCOs $G_{R(M')}(a)$, respectively. Accordingly, the shape analysis can be carried out for contour surfaces following any one of the conventions ni1, ni2, and i, leading to the shape code matrices

$$\mathbb{M}^{(a,b),R,ni1}, \quad \mathbb{M}^{(a,b),R,ni2}, \quad \text{and} \quad \mathbb{M}^{(a,b),R,i}, \tag{26}$$

respectively.
For each of these options, the similarity measures

$$s_{ni1}(R_{hg},R_{h+g}) = m[\mathbb{M}^{(a,b),R_{hg},ni1}, \mathbb{M}^{(a,b),R_{h+g},ni1}] \,/\, t, \tag{27}$$

$$s_{ni2}(R_{hg},R_{h+g}) = m[\mathbb{M}^{(a,b),R_{hg},ni2}, \mathbb{M}^{(a,b),R_{h+g},ni2}] \,/\, t, \tag{28}$$

and

$$s_{i}(R_{hg},R_{h+g}) = m[\mathbb{M}^{(a,b),R_{hg},i}, \mathbb{M}^{(a,b),R_{h+g},i}] \,/\, t, \tag{29}$$

and the associated dissimilarity measures

$$d_{ni1}(R_{hg},R_{h+g}) = 1 - s_{ni1}(R_{hg},R_{h+g}), \tag{30}$$

$$d_{ni2}(R_{hg},R_{h+g}) = 1 - s_{ni2}(R_{hg},R_{h+g}), \tag{31}$$

and

$$d_{i}(R_{hg},R_{h+g}) = 1 - s_{i}(R_{hg},R_{h+g}) \tag{32}$$

give indications of the extent, range and various details of the host-guest interaction. Among these measures, $s_i(R_{hg},R_{h+g})$ and $d_i(R_{hg},R_{h+g})$ reflect the most detail of the interactions between the host and guest regions, taking into account the influence of those molecular regions which participate only indirectly in the actual host-guest interaction. The greater the value of dissimilarity measure $d_i(R_{hg},R_{h+g})$, the greater the shape change induced by the host-guest interaction.
Additional detail can be found in the shape code matrices $\mathbb{M}^{(a,b),R_{hg},i}$ and $\mathbb{M}^{(a,b),R_{h+g},i}$ themselves; their comparisons can be carried out with special focus on low or high density ranges, considering a specific range of the curvature parameter $b$. Whereas most shape changes in host-guest interactions are expected in the low density ranges of the fuzzy regional charge distributions, if significant changes occur in the high density ranges, this is a sign of exceptionally strong host-guest interactions. Such details are not directly available from the dissimilarity measure $d_i(R_{hg},R_{h+g})$; however, one may diagnose such instances by a direct comparison of high density ranges within the shape code matrices $\mathbb{M}^{(a,b),R_{hg},i}$ and $\mathbb{M}^{(a,b),R_{h+g},i}$.
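The dissimilarity measures (30)-(32) and the range-restricted comparison suggested above can be sketched as follows. The sketch assumes that the rows of a shape code matrix correspond to density thresholds $a$ in increasing order and the columns to curvature parameters $b$; renormalizing by the number of elements in the selected rows is a choice made for this example, and the shape codes shown are hypothetical.

```python
def count_matches(code_x, code_y, rows):
    """Number of equal entries in the selected rows of two shape codes."""
    return sum(x == y for i in rows for x, y in zip(code_x[i], code_y[i]))

def dissimilarity(code_x, code_y, rows=None):
    """d = 1 - s (eqs. 30-32), optionally restricted to some threshold rows."""
    rows = list(range(len(code_x))) if rows is None else list(rows)
    t = sum(len(code_x[i]) for i in rows)
    return 1.0 - count_matches(code_x, code_y, rows) / t

# Hypothetical shape codes; rows ordered from low to high density threshold.
code_hg = [[2, 2, 1], [1, 1, 1], [1, 0, 0], [1, 0, 0]]          # interacting R_hg
code_h_plus_g = [[2, 1, 1], [1, 2, 1], [1, 0, 0], [1, 0, 0]]    # superimposed R_h+g

d_all = dissimilarity(code_hg, code_h_plus_g)
d_low = dissimilarity(code_hg, code_h_plus_g, rows=[0, 1])
d_high = dissimilarity(code_hg, code_h_plus_g, rows=[2, 3])
```

With these placeholder codes all mismatches fall in the low density rows, so `d_high` is zero while `d_low` exceeds `d_all`; a nonzero `d_high` would be the kind of signal the text associates with exceptionally strong host-guest interactions.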
SUMMARY
The electron densities of local regions of both small and large molecules can be studied in detail using some of the macromolecular quantum chemical computational techniques developed recently. The shape analysis of host-guest systems and the comparison of the electron densities of interacting and noninteracting molecular regions provide measures and detailed descriptions of these interactions.
REFERENCES
[1] W.G. Richards, Quantum Pharmacology, Butterworth, London, 1983.
[2] M. Karplus and J.A. McCammon, Annu. Rev. Biochem. 53, 263 (1983).
[3] P. De Santis, S. Morosetti, and A. Palleschi, Biopolymers 22, 37 (1983).
[4] R. Franke, Theoretical Drug Design Methods, Elsevier, Amsterdam, 1984.
[5] J.S. Richardson, Methods in Enzymol. 115, 359 (1985).
[6] M.N. Liebman, C.A. Venanzi, and H. Weinstein, Biopolymers 24, 1721 (1985).
[7] T. Kikuchi, G. Némethy, and H.A. Scheraga, J. Comput. Chem. 7, 67 (1986).
[8] P.M. Dean, Molecular Foundations of Drug-Receptor Interaction, Cambridge University Press, New York, 1987.
[9] F.M. Richards and C.E. Kundrot, Protein Struct. Funct. Genet. 3, 71 (1988).
[10] P.-L. Chau and P.M. Dean, J. Computer-Aided Molecular Design 8, 513, 527, 545 (1994).
[11] R. Carbó, L. Leyda, and M. Arnau, Int. J. Quantum Chem. 17, 1185 (1980).
[12] R. Carbó and Ll. Domingo, Int. J. Quantum Chem. 32, 517 (1987).
[13] R. Carbó and B. Calabuig, Comput. Phys. Commun. 55, 117 (1989).
[14] R. Carbó and B. Calabuig, Int. J. Quantum Chem. 42, 1681, 1695 (1992).
[15] R. Carbó, E. Besalú, B. Calabuig, and V. Vera, Adv. Quant. Chem. 25, 253 (1994).
[16] E. Besalú, R. Carbó, J. Mestres, and M. Solà, Foundations and Recent Developments on Molecular Quantum Similarity, in Topics in Current Chemistry, Vol. 173, Molecular Similarity, ed. K. Sen (Springer-Verlag, Heidelberg, 1995).
[17] E.E. Hodgkin and W.G. Richards, J. Chem. Soc. Chem. Commun. 1986, 1342 (1986).
[18] E.E. Hodgkin and W.G. Richards, Int. J. Quantum Chem. 14, 105 (1987).
[19] A. Good and W.G. Richards, J. Chem. Inf. Comp. Sci. 33, 112 (1992).
[20] S. Leicester, R. Bywater, and J.L. Finney, J. Mol. Graph. 6, 104 (1988).
[21] R.P. Bywater, Quantitative Measurement of Molecular Similarity Using Shape Descriptors, R. Carbó, Ed.; Molecular Similarity and Reactivity: From Quantum Chemical to Phenomenological Approaches; Kluwer Academic Publ.: Dordrecht, The Netherlands, 1995, pp 113-122.
[22] M.A. Johnson and G.M. Maggiora, Eds., Concepts and Applications of Molecular Similarity, Wiley, New York, 1990.
[23] C.-D. Zachman, M. Heiden, M. Schlenkrich, and J. Brickmann, J. Comp. Chem. 13, 76 (1992).
[24] C.-D. Zachman, S.M. Kast, A. Sariban, and J. Brickmann, J. Comp. Chem. 14, 1290 (1993).
[25] D.L. Cooper and N.L. Allan, Molecular Similarity and Momentum Space, R. Carbó, Ed.; Molecular Similarity and Reactivity: From Quantum Chemical to Phenomenological Approaches; Kluwer Academic Publ.: Dordrecht, The Netherlands, 1995, pp 31-55.
[26] S. Anzali, G. Barnickel, M. Krug, J. Sadowski, M. Wagener, and J. Gasteiger, Evaluation of Molecular Surface Properties Using a Kohonen Neural Network, J. Devillers, Ed.; Neural Networks in QSAR and Drug Design; Academic Press: London, 1996, pp 209-222.
[27] P.G. Mezey, Shape in Chemistry: An Introduction to Molecular Shape and Topology, VCH Publishers, New York, 1993.
[28] P.G. Mezey, "Density Domain Bonding Topology and Molecular Similarity Measures". In K. Sen, Ed., Topics in Current Chemistry, Vol. 173, Molecular Similarity, Springer-Verlag, Heidelberg, 1995.
[29] P.G. Mezey, "Methods of Molecular Shape-Similarity Analysis and Topological Shape Design". In P.M. Dean, Ed., Molecular Similarity in Drug Design, Chapman & Hall - Blackie Publishers, Glasgow, U.K., 1995.
[30] P.G. Mezey, "Shape Analysis of Macromolecular Electron Densities", Structural Chem. 6, 261 (1995).
[31] P.G. Mezey, "Functional Groups in Quantum Chemistry", Advances in Quantum Chemistry 27, 163 (1996).
[32] P.G. Mezey, "Local Shape Analysis of Macromolecular Electron Densities". In J. Leszczynski, Ed., Computational Chemistry: Reviews and Current Trends, Vol. 1, World Scientific Publ., Singapore, 1996.
[33] P.D. Walker and P.G. Mezey, J. Am. Chem. Soc. 115, 12423 (1993).
[34] P.D. Walker and P.G. Mezey, J. Am. Chem. Soc. 116, 12022 (1994).
[35] P.D. Walker and P.G. Mezey, Canad. J. Chem. 72, 2531 (1994).
[36] P.D. Walker and P.G. Mezey, J. Math. Chem. 17, 203 (1995).
[37] P.D. Walker and P.G. Mezey, J. Comput. Chem. 16, 1238 (1995).
[38] P.G. Mezey, Z. Zimpel, P. Warburton, P.D. Walker, D.G. Irvine, D.G. Dixon, and B. Greenberg, J. Chem. Inf. Comp. Sci. 36, 602 (1996).
[39] P.G. Mezey, J. Math. Chem. 18, 141 (1995).
[40] P.G. Mezey, "Molecular Similarity Measures of Conformational Changes and Electron Density Deformations", Advances in Molecular Similarity 1, 89 (1996).
[41] P.G. Mezey, Int. J. Quantum Chem. 63, 39 (1997).
Z.B. Maksić and W.J. Orville-Thomas (Editors)
Pauling's Legacy: Modern Modelling of the Chemical Bond
Theoretical and Computational Chemistry, Vol. 6
© 1999 Elsevier Science B.V. All rights reserved.
Chemical Bonding in Proteins and Other Macromolecules
Paul G. Mezey
Mathematical Chemistry Research Unit, Department of Chemistry and Department of Mathematics and Statistics, University of Saskatchewan, 110 Science Place, Saskatoon, SK, Canada, S7N 5C9
ABSTRACT
In highly folded macromolecules, such as most globular proteins, the complex pattern of non-bonded interactions between various molecular fragments plays an important stabilizing role. The change of this pattern in the course of various conformational rearrangements may enhance or hinder the conformational processes. The AFDF (additive fuzzy density fragmentation) method of ab initio quality electron density computations for proteins is a tool that provides new insight into these interactions. Based on such calculations, in this contribution a new model, the "Low Density Glue" (LDG) bonding model, is presented that approaches these interactions from a global perspective.
1. INTRODUCTION
Linus Pauling's unprecedented successes were almost always based on a clear recognition of the essential in seemingly very complex problems, leading to simple yet effective solutions. This is not a simple path to follow. In this study an attempt is made to discuss some bonding features common to many proteins. A family of new computational methodologies provides the means for a new approach to these bonding problems; however, a clear simplification in the Pauling tradition requires additional insight.
Electron density is the basis of chemical bonding [1], and density functional theory has established that electron density determines the energy content of molecular arrangements [2,3] as well as other molecular properties [4-10]. Whereas these facts clearly indicate the central role of electron density, for large molecules electron density analysis faces many difficulties. Proteins show structural complexities which involve local order and important regularities, as well as equally significant irregularities and seemingly accidental, disordered features. Chemical bonding in proteins exhibits a whole range of features not found in small molecules, yet these very features appear essential in the roles proteins play in biochemistry. There are many open problems concerning the relative roles of these features and the collective effects of formal non-bonded interactions.
Until recently, detailed theoretical studies of global bonding features of proteins were hindered by the lack of sufficiently detailed experimental or theoretical electron density maps for proteins. Recently, X-ray crystallographic structure determination methods for proteins have improved dramatically, but even with these improvements, the resolution of the observed electron density often leaves some uncertainties concerning the precise location of hydrogen nuclei. For a detailed experimental analysis of chemical bonding, electron densities of higher experimental resolution than those currently available are required. However, recent progress in macromolecular quantum chemistry and advanced molecular modeling has provided some new computational tools which can be used to address some of these questions of macromolecular bonding. The introduction of AFDF (additive fuzzy density fragmentation) methods for the study of functional groups [11-13] and the computation of ab initio quality electron densities for proteins and other macromolecules have naturally led to a renewed search for regularities in the quantum chemical descriptors of chemical bonding in large systems.
In this contribution a particular aspect of chemical bonding is discussed that becomes especially important in large, self-interacting systems. The interactions between various segments of a protein which are separated by several amino acids along the polypeptide chain, but which fall within a short geometrical distance from one another due to the actual folding pattern, are expected to contribute to the stability of the protein conformation to a significant degree. The importance of formal "non-bonded" interactions in proteins is well recognized; however, these interactions are usually considered as local effects between well-defined structural entities, such as hydrogen bonds between a given pair of electronegative elements formally "sharing" a proton. In view of the latest results of electron density calculations of proteins and other large molecules, an alternative approach is suggested. In this new model, the "Low Density Glue" (LDG) bonding model, the formal non-bonded interactions often merge into broader interactions between extensive interfaces, blurring the distinction between local interactions conventionally considered as separate effects.
2. MACROMOLECULAR QUANTUM CHEMISTRY BASED ON ADDITIVE FUZZY DENSITY FRAGMENTATION (AFDF)
The AFDF family of methods and the quantum chemical treatment of functional groups have been reviewed recently [11]. Here only a brief introduction of the notation and a summary of the basic results will be given.
At each point $\mathbf{r}$, the electronic density $\rho(\mathbf{r},K)$ of a molecule of nuclear conformation $K$ can be computed by the Hartree-Fock-Roothaan-Hall SCF LCAO ab initio method. Using a basis set $\varphi(K)$ of atomic orbitals $\varphi_i(\mathbf{r},K)$ ($i = 1, 2, \ldots, n$) and the $n \times n$ dimensional density matrix $P(\varphi(K))$, the electronic density $\rho(\mathbf{r},K)$ is obtained as
$$\rho(\mathbf{r},K) = \sum_{i=1}^{n} \sum_{j=1}^{n} P_{ij}(\varphi(K))\, \varphi_i(\mathbf{r},K)\, \varphi_j(\mathbf{r},K). \tag{1}$$
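As a minimal illustration of eq. (1), the following Python sketch evaluates the density at a point from a density matrix and a list of basis functions. The two s-type Gaussian functions, their exponent and centers, and the 2 x 2 matrix `P` are made-up illustrative values and are not taken from any of the cited calculations.

```python
import math

def s_gaussian(center, alpha):
    """Return a normalized s-type Gaussian basis function centered at `center`."""
    norm = (2.0 * alpha / math.pi) ** 0.75
    def phi(r):
        d2 = sum((ri - ci) ** 2 for ri, ci in zip(r, center))
        return norm * math.exp(-alpha * d2)
    return phi

def density(r, P, basis):
    """Evaluate eq. (1): rho(r) = sum_ij P_ij * phi_i(r) * phi_j(r)."""
    values = [phi(r) for phi in basis]
    n = len(basis)
    return sum(P[i][j] * values[i] * values[j]
               for i in range(n) for j in range(n))

# Hypothetical two-center example (all numbers are assumptions).
basis = [s_gaussian((0.0, 0.0, 0.0), 0.5), s_gaussian((1.4, 0.0, 0.0), 0.5)]
P = [[0.6, 0.4],
     [0.4, 0.6]]  # symmetric toy density matrix
rho_mid = density((0.7, 0.0, 0.0), P, basis)
print(rho_mid)
```

The double loop costs O(n^2) per point, which is one reason why sparse fragment-based representations matter for large systems.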
Electron density decreases exponentially with distance, which suggests that an Additive Fuzzy Density Fragmentation (AFDF) approach can be used for both a fuzzy decomposition and a construction of molecular electron densities. The simplest AFDF technique is the Mulliken-Mezey density matrix fragmentation [12,13], which is the basis of both the Molecular Electron Density Loge Assembler (MEDLA) [14-17] and the Adjustable Density Matrix Assembler (ADMA) [18-21] macromolecular quantum chemistry methods.
Within the general Mulliken-Mezey AFDF approach, the set of nuclei of the molecule M is classified into m mutually exclusive families
$$f_1, f_2, \ldots, f_k, \ldots, f_m,$$
and for each AO basis function $\varphi_i(\mathbf{r},K)$ and nuclear family $f_k$ a membership function $m_k(i)$ is defined:
$$m_k(i) = \begin{cases} 1 & \text{if } \varphi_i(\mathbf{r}) \text{ is centered on one of the nuclei of set } f_k, \\ 0 & \text{otherwise.} \end{cases} \tag{2}$$
In terms of some weighting factors $w_{ij}$ and $w_{ji}$ fulfilling the condition
$$w_{ij} + w_{ji} = 1, \qquad w_{ij},\, w_{ji} > 0, \tag{3}$$
the elements $P^k_{ij}(\varphi(K))$ of the $n \times n$ fragment density matrix $P^k(\varphi(K))$ for the k-th fragment are given by
$$P^k_{ij}(\varphi(K)) = [\,m_k(i)\, w_{ij} + m_k(j)\, w_{ji}\,]\, P_{ij}(\varphi(K)). \tag{4}$$
In the simplest case of the Mulliken-Mezey AFDF approach, the choice
$$w_{ij} = w_{ji} = 0.5 \tag{5}$$
is used, which follows the spirit of the population analysis scheme of Mulliken. The k-th additive fuzzy density fragment $\rho^k(\mathbf{r},K)$ is defined as
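$$\rho^k(\mathbf{r},K) = \sum_{i=1}^{n} \sum_{j=1}^{n} P^k_{ij}(\varphi(K))\, \varphi_i(\mathbf{r},K)\, \varphi_j(\mathbf{r},K), \qquad k = 1, 2, \ldots, m. \tag{6}$$
As can be easily verified, the AFDF fragment density matrices $P^k(\varphi(K))$ as well as the density fragments $\rho^k(\mathbf{r},K)$ are strictly additive:
$$P(\varphi(K)) = \sum_{k=1}^{m} P^k(\varphi(K)), \tag{7}$$
and
$$\rho(\mathbf{r},K) = \sum_{k=1}^{m} \rho^k(\mathbf{r},K). \tag{8}$$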
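The fragmentation rule of eqs. (2)-(5) and the additivity stated in eq. (7) can be checked numerically on a small example. The sketch below is only an illustration: the 3 x 3 matrix `P` and the assignment `family_of_ao` of AOs to two nuclear families are hypothetical values, and 0-based indices are used in place of the 1-based indices of the text.

```python
def fragment_density_matrices(P, family_of_ao, m):
    """Split P into m fragment matrices with w_ij = w_ji = 0.5 (eqs. 2-5).

    family_of_ao[i] is the 0-based index k of the nuclear family on which
    AO i is centered, so m_k(i) = 1 exactly when family_of_ao[i] == k.
    """
    n = len(P)
    fragments = []
    for k in range(m):
        Pk = [[0.0] * n for _ in range(n)]
        for i in range(n):
            for j in range(n):
                m_i = 1.0 if family_of_ao[i] == k else 0.0
                m_j = 1.0 if family_of_ao[j] == k else 0.0
                Pk[i][j] = (0.5 * m_i + 0.5 * m_j) * P[i][j]
        fragments.append(Pk)
    return fragments

def matrix_sum(matrices):
    """Element-wise sum of a list of equally sized square matrices."""
    n = len(matrices[0])
    return [[sum(M[i][j] for M in matrices) for j in range(n)]
            for i in range(n)]

# Hypothetical data: AOs 0 and 1 belong to family 0, AO 2 to family 1.
P = [[1.2, 0.3, 0.1],
     [0.3, 0.9, 0.2],
     [0.1, 0.2, 0.8]]
family_of_ao = [0, 0, 1]
fragments = fragment_density_matrices(P, family_of_ao, m=2)
total = matrix_sum(fragments)  # reproduces P, as stated in eq. (7)
```

Elements coupling two AOs of the same family go entirely to that family's fragment, while elements coupling AOs of different families are split equally between the two fragments.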
If the molecule M is small, then a conventional Hartree-Fock computation followed by the application of the AFDF approach allows one to study the shape and the interactions of local moieties of M in detail. If the molecule M is large, then a conventional Hartree-Fock computation is no longer feasible; however, the fuzzy electron density fragments of the large "target" molecule M can still be computed indirectly using the AFDF approach. For each nuclear family $f_k$ of M, a small parent molecule $M_k$ can be designed, where $M_k$ contains the same nuclear family $f_k$ with the same local arrangement and surroundings as is found in the large target molecule M. The fuzzy density fragmentation can be carried out for the small parent molecule $M_k$, resulting in a fuzzy density fragment $\rho^k(\mathbf{r},K)$ corresponding to the nuclear set $f_k$, and other density fragments which are not used for the macromolecular study. By repeating this procedure for each nuclear family $f_k$ of M, the fuzzy fragments
$$\rho^1(\mathbf{r},K),\ \rho^2(\mathbf{r},K),\ \ldots,\ \rho^k(\mathbf{r},K),\ \ldots,\ \rho^m(\mathbf{r},K)$$
obtained from a set of m small "parent" molecules
$$M_1, M_2, \ldots, M_k, \ldots, M_m$$
can be combined and used to construct the electron density $\rho(\mathbf{r},K)$ of the large target molecule M. The fragment densities themselves can be used for local analysis of the macromolecule M.
The MEDLA (Molecular Electron Density "Loge" Assembler, or Molecular Electron Density "Lego" Assembler) method of Walker and Mezey [14-17] was the first implementation of the simplest version (5) of the AFDF approach. The MEDLA method is based on a numerical electron density fragment database of pre-calculated, custom-made electron density fragments $\rho^k(\mathbf{r},K)$, and a subsequent numerical construction of the molecular electron density using eq. (8). According to detailed tests [14,15,17], the MEDLA method generates ab initio quality electron densities for large molecules near the 6-31G** basis set level that has been used for the construction of the fragment density databank. For the first time, ab initio quality electron densities have been computed for several proteins, including crambin, bovine insulin, the gene-5 protein (g5p) of bacteriophage M13, the HIV-1 protease monomer of 1564 atoms, and the proto-oncogene tyrosine kinase protein 1ABL containing 873 atoms.
The requirement of a numerical databank and some of the problems associated with the grid alignment of combined numerical density data are circumvented in a more advanced application of the AFDF approach relying directly on the fragment density matrices $P^k(\varphi(K_k))$. The Adjustable Density Matrix Assembler (ADMA) method [18-21] generates a macromolecular density matrix $P(\varphi(K))$ that can be used for the computation of a variety of molecular properties besides ab initio quality macromolecular electron densities. In electron density computations the accuracy of the ADMA macromolecular density matrix $P(\varphi(K))$ corresponds to that of a MEDLA result of an infinite resolution numerical grid.
The construction of the macromolecular density matrix is the simplest if the fragment density matrices $P^k(\varphi(K_k))$ obtained from small parent molecules $M_k$ fulfill the following mutual compatibility requirements:
(a) The local coordinate systems of the AO basis sets of all the fragment density matrices $P^k(\varphi(K_k))$ have axes that are parallel and have matching orientations with the axes of a common reference coordinate system defined for the macromolecule.
(b) The nuclear families used in the fragmentation of both the target and the parent molecules are compatible in the following sense: each parent molecule $M_k$ may contain only complete nuclear families from the set of nuclear families $f_1, f_2, \ldots, f_k, \ldots, f_m$ present in the large target molecule M.
Within each parent molecule additional nuclei may be involved in order to provide linkages to dangling bonds at the peripheries of these molecules. A simple similarity transformation of a fragment density matrix $P^k(\varphi(K_k))$ using a suitable orthogonal transformation matrix $T^{(k)}$ of the AO sets, and an appropriate choice of nuclear families $f_k$ for the various fragments within the macromolecule M and within the "coordination shells" of parent molecules $M_k$, can always ensure the fulfillment of these conditions. The AFDF approach enhanced with these mutual compatibility conditions is referred to as the MC-AFDF approach.
The number of AOs in the nuclear family $f_k$ of the target macromolecule M is denoted by $n_k$. For each pair $(f_k, f_{k'})$ of nuclear families a quantity $c_{k'k}$ is defined:
$$c_{k'k} = \begin{cases} 1 & \text{if nuclear family } f_{k'} \text{ is present in parent molecule } M_k, \\ 0 & \text{otherwise.} \end{cases} \tag{9}$$
An AO $\varphi(\mathbf{r})$ is denoted by the symbol $\varphi_{b,k'}(\mathbf{r})$ if its serial number b in the AO set
$$\{\varphi_{a,k'}(\mathbf{r})\}_{a=1}^{n_{k'}} \tag{10}$$
of nuclear family $f_{k'}$ is emphasized. The same AO $\varphi(\mathbf{r})$ is denoted by $\varphi^k_j(\mathbf{r})$ if its serial index j in the basis set
$$\{\varphi^k_i(\mathbf{r})\}_{i=1}^{n_p^k} \tag{11}$$
of the k-th fragment density matrix $P^k(\varphi(K_k))$ is emphasized, where the total number of these AOs is $n_p^k$,
$$n_p^k = \sum_{k'=1}^{m} c_{k'k}\, n_{k'}. \tag{12}$$
The same AO $\varphi(\mathbf{r})$ is denoted by $\varphi_y(\mathbf{r})$ if its serial index y in the AO set
$$\{\varphi_x(\mathbf{r})\}_{x=1}^{n} \tag{13}$$
of the density matrix $P(K)$ of the target macromolecule M is emphasized, where for each AO $\varphi_{a,k'}(\mathbf{r}) = \varphi^k_i(\mathbf{r}) = \varphi_x(\mathbf{r})$ the index x is determined by the index a in family k' as follows:
$$x = x(k',a,f) = a + \sum_{b=1}^{k'-1} n_b. \tag{14}$$
The last entry f in $x(k',a,f)$ indicates that k' and a refer to a nuclear family. In order to be able to determine the index x from the element index i and serial index k of fragment density matrix $P^k(\varphi(K_k))$, three quantities are introduced for each index k and nuclear family $f_{k''}$ for which $c_{k''k} \neq 0$:
$$a'_k(k'',i) = i - \sum_{b=1}^{k''} n_b\, c_{bk}, \tag{15}$$
$$k' = k'(i,k) = \min\{\,k'' : a'_k(k'',i) \le 0\,\}, \tag{16}$$
and
$$a_k(i) = a'_k(k',i) + n_{k'}. \tag{17}$$
That is, $f_{k'}$ is the first nuclear family of $M_k$ whose AOs, counted cumulatively, reach the i-th AO of the fragment basis, and $a_k(i)$ is the serial number of that AO within $f_{k'}$.
The AO index $x = x(k,i,P)$ in the density matrix $P(K)$ of target molecule M depends on indices i and k and can be expressed using index k' and the function $x(k',a,f)$:
$$x = x(k,i,P) = x(k',a_k(i),f), \tag{18}$$
where the last entry P in the index function $x(k,i,P)$ indicates that k and i refer to the fragment density matrix $P^k(\varphi(K_k))$.
Using only the nonzero elements of each (usually rather sparse) fragment density matrix $P^k(\varphi(K_k))$, the macromolecular density matrix $P(K)$ is assembled by an iterative procedure that accumulates the fragment contributions,
$$P_{x(k,i,P),\,y(k,j,P)}(K) = P_{x(k,i,P),\,y(k,j,P)}(K) + P^k_{ij}(K_k). \tag{19}$$
Since the parent molecules $M_k$ are of limited size, the entire procedure depends linearly on the number of fragments and on the size of the target macromolecule M.
The macromolecular density matrix $P(K)$ is also a sparse matrix, which simplifies its storage and subsequent computations. Using the macromolecular AO basis (stored as a list of appropriate indices referring to a standard list of AO basis sets), the macromolecular electron density is computed according to eq. (1). Using the ADMA method, approximate macromolecular forces and other properties expressible in terms of density matrices can be computed for virtually any molecule, providing a computationally viable approach to macromolecular quantum chemistry. One aspect of this field is discussed in the next section.
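The index bookkeeping of eqs. (14)-(18) and the accumulation step of eq. (19) can be sketched in a few lines of Python. This is an illustrative reading of the procedure, not the ADMA implementation itself: indices are 0-based, the mapping is written as a running subtraction over the families present in each parent molecule, and the family sizes `n_fam`, the lists `presents`, and all matrix elements are made-up values.

```python
def family_offsets(n_fam):
    """Start position of each nuclear family in the target AO list
    (0-based analogue of eq. 14), plus the total number of AOs."""
    offsets, total = [], 0
    for n_k in n_fam:
        offsets.append(total)
        total += n_k
    return offsets, total

def local_to_global(i, present, n_fam, offsets):
    """Map AO index i of a fragment matrix to its index x in the target basis.

    `present` lists, in ascending order, the families contained in the
    parent molecule (those with c_{k'k} = 1).
    """
    remaining = i
    for k_prime in present:
        if remaining < n_fam[k_prime]:
            return offsets[k_prime] + remaining
        remaining -= n_fam[k_prime]
    raise IndexError("AO index outside the fragment basis")

def assemble(fragments, presents, n_fam):
    """Accumulate nonzero fragment elements into a sparse target matrix."""
    offsets, n = family_offsets(n_fam)
    P = {}  # sparse storage: (x, y) -> value
    for Pk, present in zip(fragments, presents):
        for i, row in enumerate(Pk):
            for j, value in enumerate(row):
                if value != 0.0:
                    x = local_to_global(i, present, n_fam, offsets)
                    y = local_to_global(j, present, n_fam, offsets)
                    P[(x, y)] = P.get((x, y), 0.0) + value
    return P, n

# Hypothetical target with three nuclear families holding 1, 2 and 1 AOs.
n_fam = [1, 2, 1]
presents = [[0, 1], [0, 1, 2], [1, 2]]  # families present in each parent
fragments = [
    [[1.0, 0.1, 0.0], [0.1, 0.0, 0.0], [0.0, 0.0, 0.0]],
    [[0.0, 0.1, 0.0, 0.0], [0.1, 0.8, 0.2, 0.05],
     [0.0, 0.2, 0.7, 0.05], [0.0, 0.05, 0.05, 0.0]],
    [[0.0, 0.0, 0.05], [0.0, 0.0, 0.05], [0.05, 0.05, 0.9]],
]
P_target, n_target = assemble(fragments, presents, n_fam)
```

Because each fragment matrix has a bounded size, the work grows in proportion to the number of fragments, in line with the linear scaling noted above; the dictionary is a simple stand-in for a sparse matrix format.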
3. "LOW DENSITY GLUE" (LDG) BONDING IN PROTEINS
In large, folded chain molecules an important, low-density component of self-interactions contributes to the structural features of the molecule in a fundamentally different way than formal chemical bonds usually assigned to pairs of atomic nuclei.
These low-density contributions are more widely
distributed within the macromolecule, cannot be assigned to individual atom pairs and are better described as interactions between larger structural elements, such as chain fragments. The range of electron density that is typically involved belongs to the fuzzy, peripheral electron density cloud, and the mutual interpenetration of these fuzzy "clouds" can be used to detect these interactions. Alternatively, one may consider this low-density cloud as a formal, diluted "glue", that still has a non-negligible role in holding the various molecular structural entities together. It is natural to use fuzzy set methods [22-26] for the study of these fuzzy electron distributions [27]. The formal "bodies" of molecules do not have boundaries and the actual shape of molecules is determined by the fuzzy electron distribution. Realistic models describing molecular shapes and chemical bonding must reflect this natural fuzziness [27].
One approach involves the following question: to what extent do various points $\mathbf{r}$ of the three-dimensional space belong to a single, isolated molecule X? Such problems are typically addressed using fuzzy sets [22-27]. Consider a spatial domain D containing the nuclei of molecule X, and let $\rho_{\max}$ denote the maximum value of the electron density within D:
$$\rho_{\max} = \max\{\rho(\mathbf{r}),\ \mathbf{r} \in D\}. \tag{20}$$
In terms of $\rho_{\max}$, a fuzzy membership function $\mu_X(\mathbf{r})$ is defined for points $\mathbf{r}$ of the space, expressing the "degree" of their belonging to molecule X:
$$\mu_X(\mathbf{r}) = \rho(\mathbf{r})/\rho_{\max}. \tag{21}$$
In a similar spirit, if molecule X is not isolated, then a point $\mathbf{r}$ may belong to several different molecules to different degrees, which can also be expressed using fuzzy membership functions. In particular, the total electron density $\rho(\mathbf{r})$ at some point $\mathbf{r}$ can be regarded as a sum of electron densities $\rho_X(\mathbf{r})$ and $\rho_Y(\mathbf{r})$ attributed to individual molecules X and Y, respectively. In this case, the fuzzy membership functions of various points $\mathbf{r}$ with respect to the two molecules X and Y are determined by the relative magnitudes of the individual electron densities $\rho_X(\mathbf{r})$ and $\rho_Y(\mathbf{r})$.
In the general case, if
$$\rho_{X_1}(\mathbf{r}),\ \rho_{X_2}(\mathbf{r}),\ \ldots,\ \rho_{X_i}(\mathbf{r}),\ \ldots,\ \rho_{X_m}(\mathbf{r}) \tag{22}$$
are the electron density contributions of individual molecules
$$X_1, X_2, \ldots, X_i, \ldots, X_m, \tag{23}$$
respectively, from a molecular family L of several molecules, then these individual electron densities can be used to represent the "share" in the total electron density of the molecular family L and to define the appropriate fuzzy membership functions.
The "share" $\rho_{X_i}(\mathbf{r})$ of each individual molecule $X_i$, as a part of the complete electron density, can also be considered in the absence of all other molecules, as a separate, individual fuzzy object. Within some domain $D_{X_i}$ of the space containing all the nuclei of molecule $X_i$, the maximum value $\rho_{\max,i}$ of the electron density $\rho_{X_i}(\mathbf{r})$ is
$$\rho_{\max,i} = \max\{\rho_{X_i}(\mathbf{r}),\ \mathbf{r} \in D_{X_i}\}. \tag{24}$$
We may select a point $\mathbf{r}_{\max,i}$ where this maximum density value $\rho_{\max,i}$ is realized for the given molecule:
$$\rho_{X_i}(\mathbf{r}_{\max,i}) = \rho_{\max,i}. \tag{25}$$
A fuzzy membership function for points $\mathbf{r}$ of the space belonging to molecule $X_i$ (regarded now in the absence of all other molecules) is
$$\mu_{X_i}(\mathbf{r}) = \rho_{X_i}(\mathbf{r})/\rho_{\max,i}. \tag{26}$$
By contrast, if all other molecules of the family L are also considered to be present, then each molecule may have some partial "claim" for each point $\mathbf{r}$ of the three-dimensional space; that is, the actual "degree of belonging" of a point $\mathbf{r}$ to a given molecule $X_i$ is influenced by the electron density contributions $\rho_{X_1}(\mathbf{r}), \rho_{X_2}(\mathbf{r}), \ldots, \rho_{X_m}(\mathbf{r})$ of all molecules. In this case, the total electron density $\rho_L(\mathbf{r})$ of the molecular family $X_1, X_2, \ldots, X_j, \ldots, X_m$ taken at point $\mathbf{r}$ is of importance,
$$\rho_L(\mathbf{r}) = \sum_{j} \rho_{X_j}(\mathbf{r}), \tag{27}$$
since individual fuzzy memberships are defined relative to this total density. In particular, the fuzzy membership function $\mu_{X_i,L}(\mathbf{r})$ for points $\mathbf{r}$ of the space belonging to molecule $X_i$ of the family L is given by
$$\mu_{X_i,L}(\mathbf{r}) = \mu_{X_i}(\mathbf{r})\, [\rho_{\max,i} / \rho_L(\mathbf{r}_{\max,i})]. \tag{28}$$
The ratio $[\rho_{\max,i} / \rho_L(\mathbf{r}_{\max,i})]$ used in this expression is a scaling factor that provides a proportional treatment of the actual density contributions from various molecules of the family L. An alternative, equivalent expression for the fuzzy membership function $\mu_{X_i,L}(\mathbf{r})$ is given by
$$\mu_{X_i,L}(\mathbf{r}) = \mu_{X_i}(\mathbf{r})\, [\rho_{\max,i} / \rho_L(\mathbf{r}_{\max,i})] = [\rho_{X_i}(\mathbf{r}) / \rho_{\max,i}]\, [\rho_{\max,i} / \rho_L(\mathbf{r}_{\max,i})] = \rho_{X_i}(\mathbf{r}) / \rho_L(\mathbf{r}_{\max,i}). \tag{29}$$
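The membership functions of eqs. (26), (28) and (29) can be compared on a toy example. In the sketch below each "molecule" is represented by a one-dimensional Gaussian density, so that its maximum lies at the Gaussian center by construction; the heights, centers and widths are arbitrary assumptions chosen only to make the arithmetic visible.

```python
import math

def gaussian_density(center, height, width):
    """Toy one-dimensional 'molecular' density with its maximum at `center`."""
    def rho(x):
        return height * math.exp(-((x - center) / width) ** 2)
    return rho

def membership_isolated(rho_i, rho_max_i):
    """Eq. (26): membership of a point in molecule X_i taken alone."""
    def mu(x):
        return rho_i(x) / rho_max_i
    return mu

def membership_in_family(rho_i, r_max_i, family):
    """Eq. (29): membership of a point in X_i within the family L."""
    rho_L_at_max = sum(rho(r_max_i) for rho in family)  # rho_L(r_max,i), eq. (27)
    def mu(x):
        return rho_i(x) / rho_L_at_max
    return mu

# Hypothetical family L of two overlapping toy molecules X and Y.
rho_X = gaussian_density(center=0.0, height=1.0, width=1.0)
rho_Y = gaussian_density(center=2.0, height=0.8, width=1.0)
family = [rho_X, rho_Y]

mu_X = membership_isolated(rho_X, rho_max_i=1.0)
mu_X_L = membership_in_family(rho_X, r_max_i=0.0, family=family)

# Eq. (28) evaluated explicitly at one point, for comparison with eq. (29).
scale = 1.0 / (rho_X(0.0) + rho_Y(0.0))
eq28_value = mu_X(0.7) * scale
eq29_value = mu_X_L(0.7)
```

In this example the membership of the point at the maximum of X drops slightly below 1 once Y is present, because the tail of Y also contributes to the total density there.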
These fuzzy electron density membership functions properly describe the mutual interpenetration of fuzzy electron density clouds within the molecular family L, and they provide a description of how molecules share some common regions of space.
The fuzzy, low density glue aspect of bonding in proteins can be modeled using a rather simple approach. In tightly folded arrangements of chain molecules, the formal space filling characteristics are manifested by the merger of electronic density clouds between molecular parts which are not linked directly by formal chemical bonds. Of course, these mergers also contribute to bonding and to the stability of the actual folded pattern. For proteins and large polypeptides the accumulated computational experience suggests [15,16] that the AFDF electron densities exhibit such non-bonded mergers at many locations within the molecule, occurring at approximately the same density threshold $a_m$. The recognition of this trend is likely to assist in the search for stable conformations of proteins, in the study of side chain arrangements, and in predicting folding patterns.
The simplest implementation of this idea is the Self-Avoiding MIDCO approach, proposed as a simple, approximate method for conformation analysis of biopolymers [16]. Consider a threshold value $a_m$ that corresponds to the onset of most "non-bonded" mergers of the molecular isodensity contours (MIDCOs) $G(K, a_m)$. Using the AFDF methods, the electron density can be computed for a family R of a large number of nuclear configurations K of the macromolecule. One may test the MIDCOs $G(K, a_m+\Delta a)$, $G(K, a_m)$, and $G(K, a_m-\Delta a)$ for the selected threshold value $a_m$ and density increment $\Delta a$ for each nuclear configuration K of family R. The Self-Avoiding MIDCO method identifies those nuclear configurations K from this family R which show favorable "non-bonding" interactions: a given configuration K is selected if the non-bonded mergers of density contours which appear for the lower density MIDCO $G(K, a_m-\Delta a)$ do not yet appear in the higher density MIDCO $G(K, a_m+\Delta a)$. In practice, a value of $a_m$ falling within the range [0.003 a.u., 0.005 a.u.] of density thresholds and a density increment of $\Delta a = 0.001$ a.u. appear to be a suitable choice.
It is important to realize that the Self-Avoiding MIDCO approach is not a fuzzy set version of a hard surface contact model. If various parts of a macromolecule are placed side by side, then the electronic charge density clouds mutually enhance each other due to their partial overlap, resulting in an actual shape change of these electron density clouds. The various MIDCOs $G(K,a)$ experience significant swelling due to this overlap. The merger of the local parts of the MIDCO actually occurs at a point $\mathbf{r}$ that would fall on the outside of each individual MIDCO part without the presence of the other MIDCO part. Consequently, this rather simple Self-Avoiding MIDCO method incorporates some aspects of the non-bonded interactions resulting in a shape change of the MIDCO surfaces.
A more precise and also more detailed analysis is possible if one considers the "Low Density Glue" (LDG) part of the electron distribution, defined as the object
$$\mathrm{LDG}(K, a_m, \Delta a) = \mathrm{DD}(K, a_m-\Delta a) \setminus \mathrm{DD}(K, a_m+\Delta a), \tag{30}$$
where $\mathrm{DD}(K, a_m-\Delta a)$ and $\mathrm{DD}(K, a_m+\Delta a)$ are the density domains associated with the MIDCOs $G(K, a_m-\Delta a)$ and $G(K, a_m+\Delta a)$, respectively. Using the formalism of fuzzy set theory [26], the "low density glue" object $\mathrm{LDG}(K, a_m, \Delta a)$ can be thought of in terms of α-cuts, where two cuts are made, with the two values $\alpha = a_m-\Delta a$ and $\alpha = a_m+\Delta a$. Typically, the object $\mathrm{LDG}(K, a_m, \Delta a)$ contains at least one hollow interior cavity which is the high density range of the molecule, describing most of the nuclear neighborhoods and the pattern of conventional chemical bonds.
For the purposes of our current study, the geometrical and topological features exhibiting multiple connectedness are of special interest. The mergers of electron density clouds due to non-bonded interactions are manifested in the fact that the object $\mathrm{LDG}(K, a_m, \Delta a)$ is multiply connected.
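The selection rule of the Self-Avoiding MIDCO method and the set difference of eq. (30) can be tried out on a two-dimensional toy density. The sketch below makes several assumptions that are not part of the method as published: the density is a sum of two Gaussian blobs standing for two chain segments that are not formally bonded, a density domain is taken as the set of grid points whose density is at least the threshold, and a "merger" is detected as a drop in the number of 4-connected components. All numerical parameters are illustrative.

```python
import math

def density_grid(centers, n=81, extent=8.0, height=0.05, width=1.5):
    """Toy 2D density: one Gaussian blob per chain segment, on an n x n grid."""
    step = 2.0 * extent / (n - 1)
    coords = [-extent + k * step for k in range(n)]
    return [[sum(height * math.exp(-((x - cx) ** 2 + (y - cy) ** 2) / width ** 2)
                 for cx, cy in centers) for x in coords] for y in coords]

def density_domain(grid, a):
    """Grid points with density >= a (assumed discrete density domain DD)."""
    return [[value >= a for value in row] for row in grid]

def count_components(mask):
    """Number of 4-connected components of the True cells (flood fill)."""
    rows, cols = len(mask), len(mask[0])
    seen = [[False] * cols for _ in range(rows)]
    count = 0
    for r in range(rows):
        for c in range(cols):
            if mask[r][c] and not seen[r][c]:
                count += 1
                seen[r][c] = True
                stack = [(r, c)]
                while stack:
                    y, x = stack.pop()
                    for ny, nx in ((y + 1, x), (y - 1, x), (y, x + 1), (y, x - 1)):
                        if (0 <= ny < rows and 0 <= nx < cols
                                and mask[ny][nx] and not seen[ny][nx]):
                            seen[ny][nx] = True
                            stack.append((ny, nx))
    return count

def is_selected(grid, a_m, delta_a):
    """True if a merger is present at a_m - delta_a but absent at a_m + delta_a."""
    low = count_components(density_domain(grid, a_m - delta_a))
    high = count_components(density_domain(grid, a_m + delta_a))
    return low < high

def ldg_mask(grid, a_m, delta_a):
    """Discrete version of eq. (30): DD(a_m - delta_a) minus DD(a_m + delta_a)."""
    lo = density_domain(grid, a_m - delta_a)
    hi = density_domain(grid, a_m + delta_a)
    return [[l and not h for l, h in zip(row_lo, row_hi)]
            for row_lo, row_hi in zip(lo, hi)]

# Two segments 5.4 length units apart (an arbitrary choice).
grid = density_grid([(-2.7, 0.0), (2.7, 0.0)])
selected = is_selected(grid, a_m=0.004, delta_a=0.001)
glue = ldg_mask(grid, a_m=0.004, delta_a=0.001)
```

With these toy parameters the configuration is selected, whereas moving the two blobs much closer together or much farther apart gives the same component count at both thresholds and the configuration is rejected.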
In fact, the simplest type of connectedness property is the most relevant: arcwise multiple connectedness, or 1-connectedness. In algebraic topology, the one-dimensional homotopy group, also called the fundamental group of the object, provides a concise description of 1-connectedness. Here we are able to rely on an analogy with another chemical problem that has already been studied in some detail [28], and provides a homotopy group alternative to the homology group characterization of molecular shape [29].
An important application of homotopy groups in chemistry is the study of interrelations among reaction mechanisms. If an upper bound A is taken for energy, then the part F(A) of the potential energy hypersurface E(K) that falls below this energy bound represents all molecular species and all their interconversion processes (reactions) which are accessible below this energy bound. The family of energy-dependent reaction mechanisms which are realizable below this energy bound A forms an algebraic group, the one-dimensional homotopy group $\Pi_1(F(A))$ of the potential energy hypersurface level set F(A), that is, the potential surface truncated at the given upper bound A for energy. This group $\Pi_1(F(A))$ is the fundamental group of the truncated potential energy hypersurface, and it is referred to as the fundamental group of reaction mechanisms at the given energy bound A [29].
The fundamental group $\Pi_1(\mathrm{LDG}(K, a_m, \Delta a))$ of the "low density glue" part of the macromolecular electron density is defined analogously to the fundamental group of the potential energy hypersurface level set F(A), as the one-dimensional homotopy group of the object $\mathrm{LDG}(K, a_m, \Delta a)$. One-dimensional homotopy groups describe the patterns of loops which are not contractible into one another within the object $\mathrm{LDG}(K, a_m, \Delta a)$, hence $\Pi_1(\mathrm{LDG}(K, a_m, \Delta a))$ describes the arcwise-connectedness of $\mathrm{LDG}(K, a_m, \Delta a)$. Note that arcwise-connectedness differs from connectedness manifested in the contractibility of spherical surfaces; as has been pointed out above, the "low density glue" part of the macromolecular electron density has a hollow interior, hence not all spherical surfaces within $\mathrm{LDG}(K, a_m, \Delta a)$ are contractible to a point. For example, the inner wall of the cavity can be regarded as a topological sphere, and this sphere cannot be contracted to a point without "leaving" the body of object $\mathrm{LDG}(K, a_m, \Delta a)$. Typically, within an object $\mathrm{LDG}(K, a_m, \Delta a)$ of a macromolecule M, there are two types of spherical surfaces, those deformable into the interior wall of the cavity and those contractible to a point. These surfaces and their homotopy equivalence classes are described by the two-dimensional homotopy group $\Pi_2(\mathrm{LDG}(K, a_m, \Delta a))$ of the object $\mathrm{LDG}(K, a_m, \Delta a)$. This is a relatively simple group, a free group with a single generator. In most cases, the types of one-dimensional loops with homotopically different contractibility properties within $\mathrm{LDG}(K, a_m, \Delta a)$ are more numerous, and the fundamental group $\Pi_1(\mathrm{LDG}(K, a_m, \Delta a))$ has a large number of generators.
Similarities in the low density bonding contributions in proteins can be studied by comparing the fundamental groups $\Pi_1(\mathrm{LDG}(K, a_m, \Delta a))$. If for two proteins, or for two different folding patterns K and K' of the same protein, the two fundamental groups agree as abstract groups,
$$\Pi_1(\mathrm{LDG}(K, a_m, \Delta a)) = \Pi_1(\mathrm{LDG}(K', a_m, \Delta a)), \tag{31}$$
then their low density bonding contributions exhibit a well-defined similarity. In this case, the two conformations K and K' are regarded as LDG-homotopically equivalent, which can be expressed in the notation
$$K \sim K'. \tag{32}$$
Further specifications are possible even if the two fundamental groups do not agree, by considering the group-subgroup relations among all possible LDG fundamental groups for the given proteins. The problem is fully analogous to the characterization of the family of fundamental groups of reaction mechanisms [29] that has been given in terms of a lower semilattice. The derivation will not be repeated here; the same method can be adapted for the fundamental groups $\Pi_1(\mathrm{LDG}(K, a_m, \Delta a))$. A hierarchy of the LDG fundamental groups, organized into a lower semilattice, provides the basis for comparisons of the low density bonding patterns in proteins.
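A two-dimensional analogue may help to visualize the comparison in eq. (31). For a connected planar region the fundamental group is a free group whose rank equals the number of holes, and free groups of equal rank are isomorphic, so in this restricted setting the comparison reduces to counting holes. The sketch below does only that, on small hand-written masks; it is a toy stand-in under these assumptions and not the three-dimensional construction or the semilattice analysis of [29]. The masks and function names are hypothetical.

```python
def mask_from_rows(rows):
    """Build a boolean mask from strings; '#' marks cells of the object."""
    return [[ch == "#" for ch in row] for row in rows]

def count_holes(mask):
    """Count bounded background regions (holes) of a 2D boolean mask."""
    rows, cols = len(mask), len(mask[0])
    # Pad with one background layer so the outside forms a single region.
    padded = [[False] * (cols + 2)]
    padded += [[False] + list(row) + [False] for row in mask]
    padded += [[False] * (cols + 2)]
    seen = [[False] * (cols + 2) for _ in range(rows + 2)]
    regions = 0
    for r in range(rows + 2):
        for c in range(cols + 2):
            if not padded[r][c] and not seen[r][c]:
                regions += 1
                seen[r][c] = True
                stack = [(r, c)]
                while stack:
                    y, x = stack.pop()
                    for ny, nx in ((y + 1, x), (y - 1, x), (y, x + 1), (y, x - 1)):
                        if (0 <= ny < rows + 2 and 0 <= nx < cols + 2
                                and not padded[ny][nx] and not seen[ny][nx]):
                            seen[ny][nx] = True
                            stack.append((ny, nx))
    return regions - 1  # discard the unbounded outside region

def ldg_equivalent_2d(mask_a, mask_b):
    """Toy analogue of eq. (31) for connected planar masks: equal hole counts."""
    return count_holes(mask_a) == count_holes(mask_b)

# Hypothetical cross-sections of three LDG objects.
ring_a = mask_from_rows(["#####", "#...#", "#####"])
ring_b = mask_from_rows(["####", "#..#", "#..#", "####"])
double_ring = mask_from_rows(["#######", "#..#..#", "#######"])
```

Here `ring_a` and `ring_b` differ in geometry but have the same hole count and would be regarded as equivalent in this toy sense, while `double_ring` would not.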
REFERENCES
[1] L. Pauling, The Nature of the Chemical Bond, Cornell Univ. Press, Ithaca, 1960.
[2] P. Hohenberg, and W. Kohn, Phys. Rev., 136, B864 (1964).
[3] W. Kohn, and L.J. Sham, Phys. Rev., 140, A1133 (1965).
[4] R.G. Parr, Proc. Natl. Acad. Sci. USA, 72, 763 (1975).
[5] M. Levy, Phys. Rev. A, 26, 1200 (1982).
[6] A. Becke, Phys. Rev. A, 33, 2786 (1986).
[7] P. Politzer, J. Chem. Phys., 86, 1072 (1987).
[8] D.R. Salahub, Adv. Chem. Phys., 69, 447 (1987).
[9] E.S. Kryachko, and E.V. Ludeña, Density Functional Theory of Many-Electron Systems, Kluwer, Dordrecht, 1989.
[10] T. Ziegler, Chem. Rev., 91, 651 (1991).
[11] P.G. Mezey, "Functional Groups in Quantum Chemistry", Advances in Quantum Chemistry, 27, 163-222 (1996).
[12] P.G. Mezey, "Shape Analysis of Macromolecular Electron Densities", Structural Chem., 6, 261 (1995).
[13] P.G. Mezey, "Density Domain Bonding Topology and Molecular Similarity Measures". In K. Sen, Ed., Topics in Current Chemistry, Vol. 173, Molecular Similarity, Springer-Verlag, Heidelberg, 1995.
[14] P.D. Walker, and P.G. Mezey, J. Am. Chem. Soc., 115, 12423 (1993).
[15] P.D. Walker, and P.G. Mezey, J. Am. Chem. Soc., 116, 12022 (1994).
[16] P.D. Walker, and P.G. Mezey, J. Math. Chem., 17, 203 (1995).
[17] P.D. Walker, and P.G. Mezey, J. Comput. Chem., 16, 1238 (1995).
[18] P.G. Mezey, J. Math. Chem., 18, 141 (1995).
[19] P.G. Mezey, "Molecular Similarity Measures of Conformational Changes and Electron Density Deformations", Advances in Molecular Similarity, 1, 89 (1996).
[20] P.G. Mezey, Int. J. Quantum Chem., 63, 39 (1997).
[21] P.G. Mezey, Int. Rev. Phys. Chem., in press (1997).
[22] L.A. Zadeh, Inform. Control, 8, 338 (1965).
[23] L.A. Zadeh, J. Math. Anal. Appl., 23, 421 (1968).
[24] A. Kaufmann, Introduction à la Théorie des Sous-Ensembles Flous, Masson, Paris, 1973.
[25] L.A. Zadeh, "Theory of Fuzzy Sets". In Encyclopedia of Computer Science and Technology, Marcel Dekker, New York, 1977.
[26] G.J. Klir and B. Yuan, Fuzzy Sets and Fuzzy Logic, Theory and Applications, Prentice-Hall, Englewood Cliffs, NJ, 1995.
[27] P.G. Mezey, "Fuzzy Measures of Molecular Shape and Size". In D.H. Rouvray, Ed., Fuzzy Logic in Chemistry, Academic Press, San Diego, 1997, pp 139-223.
[28] P.G. Mezey, Potential Energy Hypersurfaces, Elsevier, Amsterdam, 1987.
[29] P.G. Mezey, Shape in Chemistry: An Introduction to Molecular Shape and Topology, VCH Publishers, New York, 1993.
Figure 1. HIV Protease monomer electron density by the AFDF method. J. Math. Chem. 17, 203 (1995).
Figure 2. AFDF electron density of Proto-oncogene Tyrosine Kinase Protein 1ABL. Drug Discovery Today, 2, 132 (1997).
Models for Understanding and Predicting Protein Structure
Dale F. Mierke
Gustaf H. Carlson School of Chemistry, Clark University, 950 Main Street, Worcester, Massachusetts, 01610, USA
email: [email protected]
Department of Pharmacology and Molecular Toxicology, University of Massachusetts, Medical Center, 55 Lake Avenue, North, Worcester, Massachusetts 01655, USA
Introduction
The generation of protein sequences (or equivalently the DNA sequences which provide the protein sequence) is accelerating at a great pace. In the foreseeable future the complete human genome will be sequenced. The challenge for the protein chemist is to assimilate and utilize this information [1]. Given the strong correlation between structure and function, the question is therefore the determination of structure from the primary protein sequence.
Over the years great advances have been made in the experimental determination of protein structure. In the field of nuclear magnetic resonance (NMR), the increasing magnetic field strength, additional radio-frequency channels and pulse-field gradients, and the use of isotopic-enriched proteins including enrichment of 13C, 15N, and partial deuteration have expanded the range of proteins that can be investigated, including the size of the protein and the conditions (pH, temperature) under which they can be investigated [2]. The data analysis, the slowest step in the structure determination, has been greatly facilitated by advances made in automation (signal identification and assignment), a product of greater computational power and algorithm development. In X-ray crystallography advances have been made in both the collection and analysis of the data (cryo-diffraction, determination of phase), also a beneficiary of greater computer power and the development of novel algorithms. However, the generation of suitable crystals is a bottleneck in the rate at which protein structures can be determined.
Given the difference in the current rate of generation of protein sequences and the experimental determination of structure, there is a vital role for theory. In addition, the experimental methods listed above are not generally suitable for the analysis of proteins within membrane environments, an area of investigation in which theoretical methods have made the most progress.
A large number of transmembrane receptors (proteins which pass through the cellular membrane, often a number of times), vital for signal transduction from the extracellular domain into the cell, have been identified. In addition, an ever increasing
number of proteins associated with cellular membranes through palmitoylation or other lipid moiety have been isolated. In this chapter, some methods currently employed in the prediction of tertiary structure from the protein sequence will be highlighted. This is not intended to be a comprehensive review of the literature, a goal that would require an entire book of this size and would quickly become outdated. Instead, the aim is to provide some of the basic tenets of the theories and methods utilized with a few select references to the original literature. Hopefully this will serve as a starting point to delve into this very exciting and ongoing field of research.
Background
The goal of protein-structure prediction is to derive the tertiary structure of the protein (defined as the manner in which the protein is bent or folded in three dimensions) given the sequence of amino acids (referred to as the primary structure). In between the primary and tertiary structure is the secondary structure, which consists of regularly recurring arrangements of the protein chain in one dimension (i.e., α-helices and β-sheets).
Figure 1: Definitions of primary (left), secondary (center), and tertiary structures (right).
This nomenclature may also describe the sequence of events in the folding process: the primary sequence adopts secondary structural elements, which then fold into the correct tertiary structure. Pauling first postulated that the hydrogen bond played a large role in the folding process. The importance of the hydrogen bond in stabilizing the secondary structural elements, α-helices and β-sheets, was quite clear from his earlier work [3-5].
Figure 2: Illustration of the hydrogen bonding networks in the secondary structural elements: (left) α-helix and (right) β-sheet.
It was thought that the folding of the protein was a rearrangement to find the fold with the maximum number of hydrogen bonds with the correct geometry. And indeed, analysis of typical protein structures reveals a large number of hydrogen bonds both within and between the secondary structural elements. However, the pendulum was about to swing away from the hydrogen bond and towards the hydrophobic effect [6]. The typical protein sequence contains a large number of hydrophobic amino acids. When these amino acids reach a specific concentration the protein collapses into a globular conformation, similar to the behavior of detergents at the critical micelle concentration. This collapse is driven by the minimization of the interaction of the hydrophobic amino acids with the aqueous environment. The condensed state is fluid in that the polymer is not in one well-defined conformation but samples many different conformations until eventually the native conformation with the lowest free energy is found. This usually ill-defined state is referred to as the molten globule state. The manner in which
the protein quickly locates the correct fold, the global minimum, in the presence of the large number of possible configurations within the molten globule is commonly known as Levinthal's paradox [7]. Great insight into the sequence of events leading to the correct fold has been obtained from experiments involving the controlled unfolding of proteins. The classic experiment by Anfinsen illustrated the reversibility of protein folding with regeneration of full enzymatic activity [8]. If the induced unfolding is completely reversible, then an understanding of the unfolding pathway will shed light on possible folding pathways. The unfolding is induced by changes in conditions such as pH and/or temperature, or by the addition of denaturing agents such as urea or guanidinium chloride. Ideally the intermediates along the folding/unfolding pathway would be structurally characterized. However, due to their instability this is often not possible (although recently some denatured structures have been characterized by NMR [9], not an easy feat given that in the unfolded state most nuclei experience similar magnetic environments, greatly reducing the signal resolution). One very facile method to probe the intermediate states is hydrogen exchange [10]. This method utilizes the fact that the amide protons of the protein chain are slightly acidic and that the rate of exchange can be altered by pH. At various stages during the unfolding process, deuterium oxide (²H₂O) is used in place of water, and therefore all amide protons that are exposed to and exchanging with the solvent at that time will be replaced with a deuteron. The number of deuterons that the protein possesses after refolding can be readily determined by mass spectrometry; the specific location of the sites of exchange within the protein must be determined by nuclear magnetic resonance [10].
The first basic tenet of protein-structure prediction is that the amino acid sequence, the primary structure, contains all of the information required for the correct folding of the polymer chain. This is a first approximation which clearly ignores the role of the environment in the induction of structure and the action of chaperone proteins which assist the in vivo folding process. The wide variety of structural motifs that have been observed for proteins is derived from only twenty different monomers (amino acids), many of which are structurally quite similar (e.g., isoleucine and leucine differ only in the branching of the butyl side chain). However, there are many cases in which the substitution of amino acids with structurally similar residues (so-called conservative substitution) leads to a protein that will not fold properly. Studies involving deletion of even small portions of the termini of the protein sequence provide similar results. On the other hand, there are proteins related through evolution with as little as 20% sequence identity which adopt similar three-dimensional structures. Therefore the information encoded in the primary sequence is specific for one protein fold; however, there are numerous other sequences, only remotely related at first glance, which will produce the same fold.
Methods
There are many different methods to attack the problem of predicting protein structure. If the protein is a member of a family of closely related proteins, the structure can be postulated
through homology analysis. If this is not the case, one can envision a sequential approach in which the secondary elements are predicted from the primary sequence and then different topological orientations of these elements are examined and judged using some energy-function description of the protein. To reduce the number of topological arrangements, the search could be biased towards the arrangements previously observed in proteins of known structure. Other methods are based on the observation that protein structures are tightly packed, with almost no vacant space in the interior or core of the protein. Such tight packing of the amino acids suggests that the arrangements of the polypeptide chain can be restricted to the points of a lattice, greatly reducing the number of possible orientations. These methods do not differentiate between secondary and tertiary structure, nor do they depend on previously determined protein structures.
Homology Modeling
By far the prediction method with the greatest accuracy is homology modeling. If there exists a protein with a known three-dimensional structure and a similar primary sequence, then the level of confidence that the protein in question will adopt a similar fold is quite high. A model can be built using the known sequence and the resulting structure analyzed for energetically favorable (and unfavorable) contacts (e.g., charge-charge interactions described by a Coulombic interaction). Despite the specificity of the primary sequence for a particular fold discussed above, a sequence identity of only 30% is sufficient for this method to work. More sophisticated methods are available that utilize not only the primary sequence but also the functional properties of the protein, and therefore greatly expand the range of homologous proteins [11]. Homology modeling has been used with great success in the class of transmembrane protein receptors [12]. Simply given the presence of seven regions of hydrophobic amino acids, model tertiary structures can be constructed. In many instances these models have provided insight into experimental data and accurate predictions of the effects of mutations of specific amino acids [13]. This will be discussed in greater detail below. For homologous sequences, the variations will be for the most part limited to the exterior portion; the central, mainly hydrophobic, core of the proteins will be very similar [14]. Of course, there will be variations, such as elongation of an α-helix or β-sheet, to accommodate specific sequence differences, but these will usually be minor. The greatest differences are therefore found in the hydrophilic loop regions which connect the elements that make up the core of the protein. These segments, containing polar and charged amino acids, are irregular in structure but certainly not structureless.
Databases of loop structures have been created, and the prediction of the correct loop conformation is usually made with a level of confidence equal to that for the presence of secondary structural elements [15,16]. A great deal of information can be obtained from the primary sequence simply by mapping out the hydrophobic and hydrophilic nature of the amino acids. Numeric scales for the hydrophobicity of the amino acids have been developed [17,18]. By calculating a running average of such a scale along the sequence, transmembrane or membrane-associated regions can be readily identified. This is particularly true for proteins with one region that passes
through the cellular membrane, such as the epidermal growth factor receptor. The identification of a single stretch of 18-25 hydrophobic amino acids clearly indicates the transmembrane region. For proteins with multiple transmembrane domains, it is not necessary to have exclusively hydrophobic amino acids; a pair of amino acids with opposite charges may be present in the lipophilic environment of the membrane. Therefore a search for amphipathic α-helices must be undertaken. Amphipathic helices have a well-defined hydrophobic face, which would project towards the membrane/lipid environment, and a hydrophilic face, which would project out into the aqueous phase or towards the core of a helix bundle. Often the distinction is not clear and there are regions of mixed hydrophobic/hydrophilic character. Graphically, the amphipathic character can be displayed with a helical-wheel representation, in which the amino acid side chains project outward at 100-degree intervals when viewed along the long axis of the helix.
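The running-average scan for hydrophobic stretches can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the Kyte-Doolittle hydropathy values are used as the scale (the text does not specify which scale is meant), the 19-residue window and the 1.6 threshold are conventional choices for that scale rather than values taken from this chapter, and the test sequence is invented.

```python
# Kyte-Doolittle hydropathy values (an assumed choice of scale).
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def hydropathy_profile(sequence, window=19):
    """Running average of the hydropathy over a sliding window."""
    values = [KYTE_DOOLITTLE[residue] for residue in sequence]
    return [sum(values[i:i + window]) / window
            for i in range(len(values) - window + 1)]


def candidate_tm_windows(sequence, window=19, threshold=1.6):
    """Start indices of windows whose mean hydropathy exceeds the threshold."""
    profile = hydropathy_profile(sequence, window)
    return [i for i, mean in enumerate(profile) if mean > threshold]


# Invented test sequence: polar flank, 20 hydrophobic residues, polar flank.
toy = "DKRESNQDKE" + "LIVALLAVILGLVAIFLLVA" + "KRDESNQEKD"
hits = candidate_tm_windows(toy)
print(hits)  # start positions of windows that look membrane-spanning
```

For a real receptor one would plot the full profile rather than threshold it, since the boundaries of a transmembrane segment are rarely sharp.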
Figure 3: Helical wheel representation illustrating the hydrophilic (boxed amino acids) and hydrophobic (circled amino acids) character of an amphipathic helix. All of the hydrophobic residues are on one face of the helix (bottom half of the figure), while the hydrophilic residues are on the top half. The hydrophobic/hydrophilic nature of the different helical faces is clearly illustrated by this projection. The amphipathic character of a segment of a protein can be calculated by use of a Fourier series and calculation of a hydrophobic moment [ 18].
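$$M = \left\{ \left[ \sum_{j} H_j \sin(j\omega) \right]^2 + \left[ \sum_{j} H_j \cos(j\omega) \right]^2 \right\}^{1/2}$$
where $H_j$ is the hydrophobicity of residue $j$ and $\omega$ is the angular separation between successive side chains (100 degrees for an α-helix).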
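The hydrophobic moment is straightforward to evaluate numerically. The sketch below assumes the usual form, the magnitude of the vector sum of the residue hydrophobicities placed at successive 100-degree increments around the helix axis; the function name and the two idealized test patterns are illustrative and not taken from the original work.

```python
import math


def hydrophobic_moment(hydrophobicities, angle_deg=100.0):
    """Magnitude of the vector sum of hydrophobicities around the helix axis."""
    omega = math.radians(angle_deg)
    sin_sum = sum(h * math.sin(j * omega) for j, h in enumerate(hydrophobicities))
    cos_sum = sum(h * math.cos(j * omega) for j, h in enumerate(hydrophobicities))
    return math.hypot(sin_sum, cos_sum)


# Idealized 18-residue patterns (five full turns of an alpha-helix).
uniform = [1.0] * 18                      # same hydrophobicity on every face
amphipathic = [1.0 if math.cos(math.radians(100.0 * j)) > 0 else -1.0
               for j in range(18)]        # one hydrophobic face, one hydrophilic face

print(hydrophobic_moment(uniform))        # essentially zero: no preferred face
print(hydrophobic_moment(amphipathic))    # large: strongly amphipathic
```

The direction of the summed vector, not computed here, is what indicates which face of the helix should point towards the lipid.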
The hydrophobic moment gives the orientation of the α-helix with respect to the lipid environment and either the aqueous interface of the membrane surface (for single α-helices) or the hydrophilic core formed from a cluster of transmembrane α-helices (seven and twelve helices are quite common). Other methods, conceptually along similar lines, have produced cleaner power spectra, facilitating the identification of the lipophilic moment [12]. Now that the location of the helices within the primary sequence of the protein, as well as the relative orientation of each helix with respect to the lipid environment, has been identified, the only task that remains is mapping the sequence onto the three-dimensional arrangement of the helices. In the seven-transmembrane receptor field the standard has been the structure of bacteriorhodopsin, which does not couple to a G-protein, determined by electron cryomicroscopy [19]. More recently a low-resolution electron crystallographic image of bovine rhodopsin, which does couple to a G-protein, has appeared in the literature [20]. These structures illustrate that the helices are not aligned parallel through the membrane but are tilted at angles, forming binding pockets within the core and between the helices themselves. The sequences identified above are mapped onto the helices obtained from these experimentally determined topologies. The hydrophobic moments are then used to rotate the helices about the long axis to maximize the fit with respect to the lipid environment. Based on the transmembrane helices of the bacterial reaction center, look-up tables of the environmental preferences of amino acids have been developed [12]. The table, containing values for the three different environments found in these receptors (i.e., lipid environment, hydrophilic core, or helix-helix interface), allows for fine tuning of the helical orientation. Models for a large number of seven-transmembrane-helix proteins have been determined.
Many of these models have been successfully used to provide atomic insight into the genetic manipulation of the receptors.
Secondary Structure Prediction
Ever since Pauling described the existence of well-defined secondary structural elements [4,5], a great deal of work has been carried out towards the accurate prediction of the location of these elements. Given the accurate location of the α-helices, β-sheets, and turns, the only remaining task would be the correct folding of these elements. This is still a formidable task, as highlighted below, but the number of available configurational folds is greatly reduced in this manner. The accuracy of these methods currently hovers around 60-70%. These methods are based on the premise that a protein segment of a specified number of residues has a unique conformation (i.e., the secondary element is derived from local interactions) which can be identified in a database of known protein structures [21,22]. It was shown that the length of the segment is important for the success of this procedure. With too short a segment there would be no common structural features: a segment consisting of five residues (a pentapeptide) was shown to be insufficient [23]. Using too long a segment would leave too few examples in the database, and therefore the preference for a secondary structure would not be well-defined. This general approach lends itself to the use of neural nets [24-26]. Using neural nets with the inclusion of evolutionary information on the protein sequences has produced a method with greater than 70% accuracy [27].
Once the secondary structural elements have been identified, the task of putting the "pieces" of the "puzzle" together must be undertaken. For many questions of biological or pharmaceutical interest, however, the prediction of the secondary structural elements is sufficient. If a particular structure-activity relationship requires a helix-loop-helix motif, the prediction of secondary structural elements without regard for the topological arrangement allows one to ascertain whether the new protein is a likely candidate for further investigation. If the mapping of the secondary elements is not sufficient, it is necessary to proceed to combine the "pieces" into a tertiary structure. The first step, as alluded to above, is the development of possible loop conformations which connect the regions of secondary structure. The loops, which do not fit into the well-defined categories of α-helices or β-sheets, have been fairly well characterized using the database of proteins for which the three-dimensional structure is known [15,16]. The identification of specific loop conformations provides insight into the possible orientations, or at least places limitations on the possible orientations, of the various secondary structural elements. The second step is then analysis of the array of amino acids within the secondary structural elements, with attention to the environment in which the amino acids would be found. It is clear that a cluster of hydrophobic amino acids is unlikely to project into the aqueous solution and more likely to project into the core of the protein. This analysis provides additional restrictions on the number of possible arrangements in which the secondary structural elements may be found. Another approach is to map the arrangement of secondary structural elements onto the known tertiary structures of other proteins. Currently, approximately one hundred unique protein folds have been identified.
There is some question as to whether this is an upper limit. If this is indeed the case, then the protein of unknown structure must adopt a known topological fold. The secondary structural elements are mapped onto the template of the different known protein structures. The best fits, as judged by the environmental factors (solvent accessibility) of the individual amino acids, are then further analyzed as probable folds. This procedure is referred to as threading the secondary elements into three-dimensional structures [28].
Primary to Tertiary Prediction
Many different methods have been developed which skip over the prediction of secondary structural elements and proceed directly from the primary sequence to the protein fold [29-33]. The most straightforward again involves the concept of threading. From a database of known protein structures, templates are created. These templates contain the relative topological arrangement of all the different secondary structural elements. The loops between them have been removed to allow for variability in the number of amino acids (again, differences between homologous proteins are usually located in the loop regions). The original sequence is not retained; instead, a place holder is created at the location of each amino acid. The protein sequence with unknown structure is then threaded through this template like a string of beads, with each bead representing an amino acid that is located in a place holder or within the variable loop domains. Then an energy or a measure of the fit is calculated, and the arrangement with the lowest value is deemed the most probable structure.
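The threading idea can be made concrete with a deliberately crude Python sketch. Everything specific here is an assumption made for illustration: each template is reduced to a string of place holders labelled buried (B) or exposed (E), variable loops and gaps are ignored, the hydrophobic residue set and the +1/-1 scoring are invented, and the two templates and the test sequence are hypothetical.

```python
# Residues treated as hydrophobic (an illustrative classification).
HYDROPHOBIC = set("AVLIMFWC")


def fit_score(segment, environment):
    """Pseudo-energy: -1 when a residue suits its place holder, +1 otherwise."""
    score = 0
    for residue, env in zip(segment, environment):
        suits = (residue in HYDROPHOBIC) == (env == "B")
        score += -1 if suits else 1
    return score


def thread(sequence, templates):
    """Slide the sequence through every template; return (score, name, offset) of the best fit."""
    best = None
    for name, environment in templates.items():
        for offset in range(len(sequence) - len(environment) + 1):
            segment = sequence[offset:offset + len(environment)]
            score = fit_score(segment, environment)
            if best is None or score < best[0]:
                best = (score, name, offset)
    return best


# Hypothetical templates: B = buried place holder, E = solvent-exposed place holder.
templates = {
    "amphipathic_helix": "BEEBBEEBBEEB",
    "buried_strand": "BBBBBB",
}
print(thread("GSLKELLKKLAEELGK", templates))
```

Because the offset is free, a short template can match part of a long sequence, which is the length independence that threading relies on.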
One advantage is that the template and test protein do not need to be of similar lengths. A very good fit could be identified for the N-terminal portion of a very long test sequence by a much shorter template. Large proteins often adopt different structural domains with identifiable folds. Likewise, a short test sequence could adopt a fold that utilizes only a small portion of the template. This rather straightforward-sounding method avoids the problems associated with the identification of secondary structure elements. The assumption is that most protein folds have already been identified and therefore the unknown structure of the test protein will most likely resemble a fold within the database. It is clear that a novel protein fold will not be identified by this method. Other problems lie in the flexibility of amino acid replacement. There are many instances in which a large amino acid in the test sequence replaces a smaller one in the template. This could be energetically costly in the model, while in nature these replacements are accommodated by a few rotations about side chain dihedral angles or very small adjustments of the backbone dihedral angles [29]. The calculation of the energy, or of the fit of the test sequence to the fold of the template, is no easy matter. The utilization of a full force field with complete atom representation does not properly discriminate between the different folds [31]. This seems to be related to an energy surface that is too fine and the presence of numerous local minima. In its place a potential function based on a statistical analysis of known protein structures has been developed [34]. The pair-wise penalty function provides a pseudo-energy based on the number of times the specific interaction has been observed in known protein structures. This function provides amino acid-amino acid interactions as well as a measure for the solvent exposure of each amino acid [34].
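One common way to turn observation counts into a pseudo-energy is the inverse Boltzmann relation, E = -kT ln(observed/expected). The Python sketch below assumes that form, which may differ in detail from the potential of reference [34]; the expected count is taken from the overall frequency of each residue type among the contacts, energies are in units of kT, and the contact list is invented.

```python
import math
from collections import Counter


def pair_pseudo_energies(contacts):
    """Pseudo-energy (units of kT) for each residue pair from observed contact counts."""
    pair_counts = Counter(tuple(sorted(pair)) for pair in contacts)
    residue_counts = Counter(residue for pair in contacts for residue in pair)
    n_contacts = len(contacts)
    n_ends = 2 * n_contacts  # every contact contributes two residues
    energies = {}
    for (a, b), observed in pair_counts.items():
        freq_a = residue_counts[a] / n_ends
        freq_b = residue_counts[b] / n_ends
        # Expected count if partners were chosen at random; unlike pairs can form two ways.
        expected = n_contacts * freq_a * freq_b * (1 if a == b else 2)
        energies[(a, b)] = -math.log(observed / expected)
    return energies


# Invented contact list standing in for a survey of known structures.
contacts = ([("L", "L")] * 6 + [("K", "E")] * 4
            + [("L", "K")] + [("L", "E")])
energies = pair_pseudo_energies(contacts)
for pair, value in sorted(energies.items()):
    print(pair, round(value, 2))
```

Pairs seen more often than chance receive a negative (favorable) value, and pairs seen less often receive a positive penalty.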
Another similar approach is to use the known three-dimensional structures to create look-up tables which contain the most favorable environmental parameters of each amino acid. The parameter sets are created in terms of secondary structure, hydrogen bonding pattern, solvent accessibility, and local presence of polar atoms [30,32]. In this manner the three-dimensional information is encoded into a one-dimensional string. A comparison is then made of the test protein sequence with this one-dimensional string. If the test sequence is similar, a model fold can be created for further analysis.
Energetic Force Fields
In contrast to the methods mentioned above, which differentiate between secondary and tertiary structure or utilize a database of known protein structures, there is the possibility of utilizing one of the many potential energy functions which have been shown to accurately reproduce many features of proteins, including thermodynamics and molecular motions. The potential energy force fields vary in specific details, mainly depending on the target molecule for which they were developed. A very typical energy force field is shown below.
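$$U = \sum_{\text{bonds}} \tfrac{1}{2} k (R - R_0)^2 + \sum_{\text{angles}} \tfrac{1}{2} k (\theta - \theta_0)^2 + \sum_{\text{dihedrals}} \tfrac{1}{2} k \left[ \cos(n\phi + \delta) + 1 \right] + \sum_{i<j} \frac{q_i q_j}{\varepsilon r} + \sum_{i<j} \left( \frac{A}{r^{12}} - \frac{B}{r^{6}} \right)$$
The five terms are, in order, the bond, angle, dihedral, Coulombic, and Lennard-Jones contributions.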
Figure 4: Typical potential energy force field for a protein.
A simple-minded, "brute force" approach would entail the generation of the configurational fold that minimizes the potential energy given by this or a similar energy function. The problem resides in the vast number of conformations (or degrees of freedom) available to the protein. Assuming an average of four dihedral angles per amino acid, each with six possible values, gives 6^4 = 1296, i.e., over 1200, available states. This is for each amino acid. Clearly a systematic sampling of all of the available states is not possible: the protein folds too fast for this to occur. How a protein nevertheless finds its native fold is the essence of Levinthal's paradox [7]. With current computational power, the utilization of such a force field in a molecular dynamics or simulated annealing approach can sample time scales of up to a few nanoseconds, not nearly sufficient for the folding of the protein chain (typically on the order of seconds). Given the correct fold, molecular dynamics simulations can provide great insight into small fluctuations about the global configuration, but the full atom representation is not suited for searching the large number of possibilities available to a protein starting from the primary sequence. Therefore, if such simulations are going to be useful for the ab initio searching of global folds, simplifications must be made. It should be noted that full atom representations have been used to examine the unfolding of small peptides with the hope of gaining insight into protein unfolding [35,36].
Reduced Atom Representation
One mode of simplification concerns the representation of the polymer itself. The backbone of the protein polymer is illustrated below.
Figure 5: Schematic of protein backbone.
The first simplification is the reduction of the side chain atoms to a single point. Often this point is placed at the position of the beta-carbon (denoted βC in Figure 5). The radius of the point is defined by the relative size of the side chain. The point can be given a charge to mimic the acidic or basic nature of the residue. By variation of the non-bonded parameters the point representation of the side chain can be treated as hydrophobic or hydrophilic. This simplification can reduce the number of atoms by up to 70%, depending on the sequence of the protein. Removal of the side chain dihedral angles greatly decreases the number of degrees of freedom available to the protein. However, this comes at the cost of losing many of the fine structural features of the different side chains; these simply cannot be differentiated with the non-bonded parameters of a single point. Given the planarity of the amide linkage between amino acids, the distance between consecutive alpha-carbons is 3.8 Å. A second simplification is to maintain a constant distance of 3.8 Å between the alpha-carbons (denoted αC in Figure 5) and to remove the amide (denoted HN) and carbonyl (denoted C'-O') functionalities. The combination of these two simplifications reduces the peptide chain to two points per amino acid. It is not a great leap to reduce the polymer chain to one point per amino acid, with a distance of 3.8 Å between successive points. The problem with this second simplification is the loss of the hydrogen bonding capability.
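The 3.8 Å spacing follows directly from the planar, trans geometry of the peptide unit, as the short Python calculation below illustrates. The bond lengths and angles are approximate textbook values assumed for this sketch, not parameters taken from the force field discussed in this chapter.

```python
import math

# Approximate peptide geometry (assumed textbook values; angstroms and degrees).
CA_C, C_N, N_CA = 1.52, 1.33, 1.46
ANGLE_CA_C_N, ANGLE_C_N_CA = 116.0, 122.0


def ca_ca_distance(trans=True):
    """Distance between successive alpha-carbons for a planar peptide bond."""
    ca1 = (0.0, 0.0)
    c = (CA_C, 0.0)
    # C -> N makes the CA-C-N angle with C -> CA1, which points along -x.
    theta = math.radians(180.0 - ANGLE_CA_C_N)
    n = (c[0] + C_N * math.cos(theta), c[1] + C_N * math.sin(theta))
    # Start from the N -> C direction and rotate by the C-N-CA angle;
    # the sense of rotation selects the trans or the cis arrangement.
    back = theta + math.pi
    turn = math.radians(ANGLE_C_N_CA)
    direction = back + turn if trans else back - turn
    ca2 = (n[0] + N_CA * math.cos(direction), n[1] + N_CA * math.sin(direction))
    return math.dist(ca1, ca2)


print(round(ca_ca_distance(), 2))             # trans peptide bond, close to 3.8
print(round(ca_ca_distance(trans=False), 2))  # cis peptide bond, markedly shorter
```

Because the peptide unit is rigid and planar, this distance does not depend on the backbone dihedral angles, which is what justifies holding it fixed in a reduced representation.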
Although hydrogen bonding does not seem to play a vital role in the folding of the protein chain, its importance in the stabilization of the final fold has not been questioned. Therefore, the removal of the capability to form hydrogen bonds may cause problems: not in the analysis of the folding process itself, but in the identification and proper calculation of the energy during the sampling of the available states. Recently a series of polymers in which the amino acid side chain was attached to the amide nitrogen rather than to the alpha-carbon was synthesized. These polymers, referred to as peptoids, do not fold in the same manner as the corresponding peptides. This is evidence that hydrogen bonding is indeed important for the definition of the global minimum energy structure. It therefore seems imperative to maintain at least three points per amino acid for an accurate representation of the protein chain: an NH point to represent the amide group, a β-carbon point with appropriate charge/character (size, polarity) to reproduce the amino acid side chain, and a CO point for the carbonyl group. Both the NH and CO points are able to form hydrogen bonds. We have recently developed a force field for this reduced atom representation for use in a simulated annealing molecular dynamics simulation [37]. The method we use is a hybrid distance geometry approach with a refinement using a molecular dynamics force field with the reduced atom representation. The distance geometry method utilizes the approach of metric-matrix distance geometry as described by Havel and Crippen [38]. The three points for each amino acid are defined from a starting structure of undefined conformation but with accurate bond distances and bond angles. The NH and CO points are taken as the geometric centers of the amide and carbonyl groups, respectively. The standard backbone dihedral angles, φ and ψ, defined as CO-HN-CB-CO and HN-CB-CO-HN, respectively, are completely free to rotate.
The upper and lower distance bounds are calculated using standard geometry by assuming free rotation. The bounds are then further refined by application of the triangle inequality, resulting in tighter upper and lower distance bounds. A pair of atoms is then chosen at random and a distance between their upper and lower limits is randomly selected, a process defined as random metrization [39]. Once the exact distance between one pair of atoms is chosen, the upper and lower bounds between the other atoms can be further tightened by re-applying the triangle inequality. Another atom pair is chosen randomly and the process is repeated, producing a real, symmetric matrix of distances. The eigenvectors associated with the largest eigenvalues from the diagonalization of this matrix can be used as the principal axes for the Cartesian coordinates of the conformation fulfilling the chosen distances. There are a number of advantages to using the distance geometry approach. The generation of the distance matrix is completely general; no conformations are excluded. In addition, since the molecular constitution is described by distances, one is not limited to three spatial dimensions. Utilization of the eigenvectors associated with the four largest eigenvalues produces coordinates in four dimensions which are consistent with the three-dimensional protein. Higher dimensionality has been used by a number of different groups to simplify the searching of conformational states [40,41]. A three-dimensional object can "tunnel" through itself in four dimensions. Casting an object of N points into (N-1)-dimensional space allows for the calculation of the global minimum in one step of energy minimization. The subsequent reduction of the dimensionality followed by energy minimization should produce the global minimum [41].
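The bound-tightening and random metrization steps can be sketched as follows. This is a minimal pure-Python illustration, not the implementation used in the work described here: the function names are invented, the smoothing simply repeats until no bound changes (a simple but not the most efficient choice), and the bound matrices are assumed to be symmetric with zeros on the diagonal.

```python
import random


def smooth_bounds(lower, upper):
    """Tighten symmetric bound matrices in place using the triangle inequality."""
    n = len(upper)
    changed = True
    while changed:
        changed = False
        for k in range(n):
            for i in range(n):
                for j in range(n):
                    if i == j:
                        continue
                    # Upper bound: d_ij <= d_ik + d_kj.
                    if upper[i][j] > upper[i][k] + upper[k][j]:
                        upper[i][j] = upper[i][k] + upper[k][j]
                        changed = True
                    # Lower bound: d_ij >= d_ik - d_kj (and the mirrored form).
                    tightest = max(lower[i][k] - upper[k][j],
                                   lower[j][k] - upper[k][i])
                    if lower[i][j] < tightest:
                        lower[i][j] = tightest
                        changed = True


def random_metrization(lower, upper, rng):
    """Fix one randomly chosen distance at a time, re-smoothing after each choice."""
    n = len(upper)
    smooth_bounds(lower, upper)
    pairs = [(i, j) for i in range(n) for j in range(i + 1, n)]
    rng.shuffle(pairs)
    for i, j in pairs:
        d = rng.uniform(lower[i][j], upper[i][j])
        lower[i][j] = lower[j][i] = upper[i][j] = upper[j][i] = d
        smooth_bounds(lower, upper)
    return upper  # all bounds have collapsed to trial distances


# Three points: d01 = 5 and d12 = 1 are fixed, d02 is initially almost free.
lo = [[0, 5, 0], [5, 0, 1], [0, 1, 0]]
up = [[0, 5, 100], [5, 0, 1], [100, 1, 0]]
smooth_bounds(lo, up)
print(lo[0][2], up[0][2])  # bounds on d02 after smoothing
```

The resulting matrix of trial distances is what is subsequently diagonalized to obtain Cartesian coordinates.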
The resulting four-dimensional structures are then refined with a simple molecular dynamics force field which is based solely on distances (i.e., Coulombic potential, Lennard-Jones non-bonded potential) and is therefore fully consistent with the higher dimensionality. After the simulated annealing protocol, the Cartesian coordinates of the structure are converted back to a real, symmetric matrix, which is then diagonalized. The eigenvectors associated with the three largest eigenvalues are then used as the principal axes in the generation of the three-dimensional Cartesian coordinates. The three-dimensional structure is then further refined by addition of all of the atoms and utilization of a force field for the full atomic representation. Similar procedures have been reported by other research groups [42].
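The conversion from coordinates to a symmetric matrix and back to coordinates of lower dimension can be sketched with NumPy. The sketch assumes the standard metric-matrix construction, in which the squared distances are double-centered about the centroid before diagonalization; the function names are illustrative, and the actual protocol of reference [37] may differ in detail.

```python
import numpy as np


def distance_matrix(coords):
    """All pairwise Euclidean distances for an (n_points, n_dims) coordinate array."""
    diff = coords[:, None, :] - coords[None, :, :]
    return np.sqrt((diff ** 2).sum(axis=-1))


def embed(distances, dim):
    """Coordinates in `dim` dimensions from a symmetric distance matrix."""
    d2 = np.asarray(distances, dtype=float) ** 2
    n = d2.shape[0]
    centering = np.eye(n) - np.ones((n, n)) / n
    metric = -0.5 * centering @ d2 @ centering           # metric matrix about the centroid
    eigenvalues, eigenvectors = np.linalg.eigh(metric)   # eigenvalues in ascending order
    top = np.argsort(eigenvalues)[::-1][:dim]            # indices of the largest eigenvalues
    scale = np.sqrt(np.clip(eigenvalues[top], 0.0, None))
    return eigenvectors[:, top] * scale


def project_to_3d(coords_nd):
    """Reduce higher-dimensional coordinates to three dimensions via the metric matrix."""
    return embed(distance_matrix(coords_nd), 3)


# Invented example: eight points that are nearly, but not exactly, three-dimensional.
rng = np.random.default_rng(1)
coords_4d = np.hstack([rng.normal(size=(8, 3)), 0.05 * rng.normal(size=(8, 1))])
coords_3d = project_to_3d(coords_4d)
print(coords_3d.shape)
```

When the fourth coordinate is small, as in this example, the projection distorts the interatomic distances only slightly, which is why a short three-dimensional refinement afterwards is sufficient.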
Reduced Conformational Space: Lattice Models
Another approach to simplifying the protein chain is to reduce the conformational space allowed to the protein. The argument is simply that one of the major forces in protein structure is the formation of a hydrophobic core in which the side chains are tightly packed with no free volume. This is indeed postulated to be one of the first steps in protein folding. If this is the case, the core of the protein can be simplified to the points on a lattice [43-47]. A lattice (commonly 27 points are utilized) reduces the number of possibilities that need to be examined. Each amino acid is treated either as a single point, the alpha-carbon, or as two points, the alpha- and beta-carbons, and is assigned to one of the points of the lattice. The searching can be carried out with simulated annealing, molecular dynamics or Monte Carlo algorithms. Given the reduced atom representation and the lattice restricting the locations, one can address Levinthal's paradox, since each and every possible configuration can be examined. One recent study [44] used a protein of known structure. The authors examined every possible conformation and whether it led to the correct protein fold, in essence the kinetics of the folding process. Surprisingly, they found no correlation between the presence of secondary structural elements and finding the correct fold. Instead, the important feature that led to the correct fold was the presence of the native state as a well-defined energy minimum. The energy landscape over the configurations is the determining feature for correct folding. The idea of the folding process being driven by the available energy landscape has been discussed in terms of protein folding funnels [48-50]. The folding polymer chains have been likened to glasses; below a certain critical temperature the protein exhibits glass-like properties. When the simulation is below this temperature, no folding occurs.
Above this temperature, there is sufficient energy available to the system for the chain to locate the global minimum and adopt the correct tertiary structure. Such simulations are shedding new light onto the protein folding problem.
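The idea of examining each and every configuration on a lattice can be sketched for a toy case. The example below is only an illustration under stated assumptions: it uses a two-dimensional square lattice and a simple hydrophobic/polar (H/P) contact energy rather than the 27-point three-dimensional lattices and potentials of the cited studies, and all names are hypothetical.

```python
MOVES = [(1, 0), (-1, 0), (0, 1), (0, -1)]

def enumerate_chains(n_residues):
    """Yield every self-avoiding chain of n_residues lattice points,
    starting at the origin of a square lattice."""
    def extend(chain, occupied):
        if len(chain) == n_residues:
            yield list(chain)
            return
        x, y = chain[-1]
        for dx, dy in MOVES:
            site = (x + dx, y + dy)
            if site not in occupied:
                chain.append(site)
                occupied.add(site)
                yield from extend(chain, occupied)
                chain.pop()
                occupied.remove(site)
    yield from extend([(0, 0)], {(0, 0)})

def contact_energy(chain, sequence):
    """Assumed toy energy: -1 for each pair of H residues that are
    lattice neighbours but not neighbours along the chain."""
    index_at = {site: i for i, site in enumerate(chain)}
    energy = 0
    for i, (x, y) in enumerate(chain):
        if sequence[i] != "H":
            continue
        for dx, dy in MOVES:
            j = index_at.get((x + dx, y + dy))
            # j > i + 1 counts each non-bonded pair exactly once.
            if j is not None and j > i + 1 and sequence[j] == "H":
                energy -= 1
    return energy

def lowest_energy(sequence):
    """Exhaustively search all chains and return (energy, chain) of the best."""
    best = min(enumerate_chains(len(sequence)),
               key=lambda chain: contact_energy(chain, sequence))
    return contact_energy(best, sequence), best

n_conformations = sum(1 for _ in enumerate_chains(5))
energy, chain = lowest_energy("HPPH")
```

The number of conformations grows roughly exponentially with chain length, which is why exhaustive enumeration is feasible only for short chains or for compact conformations confined to a small lattice.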
Conclusions

In this contribution I have tried to highlight some of the various approaches currently being employed in the field of protein structure prediction. Of course, neither all of the different methods nor all of the contributions from the many groups working in the field could be covered. However, I hope that the reader will find this a useful starting point, with its many references, for delving into this exciting and rapidly expanding area of research. It is clear that there is still a long way to go before the tertiary structure can be accurately predicted directly from the primary sequence.
Acknowledgments

The author would like to thank Dr. Maria Pellegrini and Eduaro Mercurio for fruitful discussions and reading of the manuscript.
References

1. Bork, P., Ouzounis, C., Sander, C., Scharf, M., Schneider, R., Sonnhammer, E. "What's in a genome?" Nature 1992 358, 287.
2. Wagner, G. "Prospects for NMR of large proteins" J. Biomol. NMR 1993 3, 375-385.
3. Pauling, L., Corey, R. B. "The structure of synthetic polypeptides" Proc. Natl. Acad. Sci. USA 1951 37, 241-250.
4. Pauling, L., Corey, R. B., Branson, H. R. "The structure of proteins: two hydrogen bonded helical configurations of the polypeptide chain" Proc. Natl. Acad. Sci. USA 1951 37, 205-211.
5. Pauling, L., Corey, R. B. "Configurations of polypeptide chains with favored orientation around single bonds: two new pleated sheets" Proc. Natl. Acad. Sci. USA 1951 37, 729-740.
6. Kauzmann, W. "Some factors in the interpretation of protein denaturation" Adv. Protein Chem. 1959 14, 1-63.
7. Levinthal, C. "Are there pathways for protein folding?" J. Chim. Phys. 1968 65, 44-45.
8. Anfinsen, C. B. "Principles that govern the folding of protein chains" Science 1973 181, 223-230.
9. Logan, T. M., Theriault, Y., Fesik, S. W. "Structural characterization of the FK506 binding protein unfolded in urea and guanidine hydrochloride" J. Mol. Biol. 1994 236, 637-648.
10. Dobson, C. M., Evans, P. A., Radford, S. E. "Understanding how proteins fold: the lysozyme story so far" Trends Biol. Sci. 1994 19, 31-37.
11. Sander, C., Schneider, R. "Database of homology-derived structures and the structural meaning of sequence alignment" Proteins: Struc. Func. Genet. 1991 9, 56-68.
12. Donnelly, D., Overington, J. P., Blundell, T. L. "The prediction and orientation of α-helices from sequence alignments: the combined use of environment-dependent substitution tables, Fourier transform methods and helix capping rules" Prot. Engng. 1994 7, 645-653.
13. Schwartz, T. W., Rosenkilde, M. M. "Is there a 'lock' for all agonist 'keys' in TM7 receptors" Trends Pharm. Sci. 1996 17, 213-216.
14. Janin, J., Chothia, C. "Domains in proteins: definitions, location and structural principles" Methods Enzymol. 1985 115, 420-430.
15. Sibanda, B. L., Thornton, J. M. "β-hairpin families in globular proteins" Nature 1985 316, 170-174.
16. Branden, C., Tooze, J. "Introduction to protein structure", Garland Publishing, Inc., New York, 1991.
17. Kyte, J., Doolittle, R. F. "A simple method for displaying the hydropathic character of a protein" J. Mol. Biol. 1982 157, 105-132.
18. Eisenberg, D., Weiss, R. M., Terwilliger, T. C. "The hydrophobic moment detects the periodicity in protein hydrophobicity" Proc. Natl. Acad. Sci. USA 1984 82, 140-144.
19. Henderson, R., Baldwin, J. M., Ceska, T. A., Zemlin, F., Beckmann, E., Downing, K. H. "Model for the structure of bacteriorhodopsin based on high-resolution electron cryomicroscopy" J. Mol. Biol. 1990 213, 899-929.
20. Schertler, G. F. X., Villa, C., Henderson, R. "Projection structure of rhodopsin" Nature 1993 362, 770-772.
21. Chou, P. Y., Fasman, G. "Prediction of protein conformation" Biochemistry 1974 13, 222-245.
22. Goldman, N., Thorne, J. L., Jones, D. T. "Using evolutionary trees in protein secondary structure prediction and other comparative sequence analyses" J. Mol. Biol. 1996 263, 196-208.
23. Kabsch, W., Sander, C. "Dictionary of protein secondary structure: pattern recognition of hydrogen bonded and geometrical features" Biopolymers 1983 22, 2577-2637.
24. Holley, H. L., Karplus, M. "Protein secondary structure prediction with a neural network" Proc. Natl. Acad. Sci. USA 1989 86, 152-156.
25. Rost, B., Sander, C. "Combining evolutionary information and neural networks to predict protein structure" Proteins: Struc. Funct. Genet. 1994 19, 55-72.
26. Bohr, H., Bohr, J., Brunak, S., Cotterill, R. M. J., Lautrup, B., Norskov, L., Olsen, O. H., Peterson, S. B. "Protein secondary structure and homology by neural networks" FEBS Letters 1988 241, 223-228.
27. Rost, B., Sander, C. "Prediction of protein secondary structure at better than 70% accuracy" J. Mol. Biol. 1993 232, 584-599.
28. Rost, B. "Fitting 1-D predictions into 3-D structures", in "Protein Folds, A distance-based approach", Bohr, H., Brunak, S., Eds., CRC Press, 1996, pp. 132-151.
29. Ponder, J. W., Richards, F. M. "Tertiary templates for proteins: use of packing criteria in the enumeration of allowed sequences for different structural classes" J. Mol. Biol. 1987 193, 775-791.
30. Bowie, J. U., Lüthy, R., Eisenberg, D. "A method to identify protein sequences that fold into a known three-dimensional structure" Science 1991 253, 164-170.
31. Jones, D. T., Taylor, W. R., Thornton, J. M. "A new approach to protein fold recognition" Nature 1992 358, 86-89.
32. Johnson, M. S., Overington, J. P., Blundell, T. L. "Alignment and searching for common protein folds using a data bank of structural templates" J. Mol. Biol. 1993 231, 735-752.
33. Michie, A. D., Orengo, C. A., Thornton, J. M. "Analysis of domain structural class using an automated class assignment protocol" J. Mol. Biol. 1996 262, 168-185.
34. Sippl, M. J. "Calculation of conformational ensembles from potentials of mean force: an approach to the knowledge-based prediction of local structures in globular proteins" J. Mol. Biol. 1990 213, 859-883.
35. DiCapua, F. M., Swaminathan, S., Beveridge, D. L. "Theoretical evidence for destabilization of an α-helix by water insertion: molecular dynamics of hydrated decaalanine" J. Am. Chem. Soc. 1990 112, 6768-6771.
36. Brooks, C. L. "Molecular simulations of peptide and protein unfolding: in quest of a molten globule" Curr. Op. Struc. Biol. 1993 3, 92-98.
37. Mierke, D. F., Melcuk, A., Pellegrini, M. in preparation.
38. Crippen, G. M., Havel, T. F. "Distance Geometry and Molecular Conformation", Research Studies Press Ltd., Somerset, England; John Wiley, New York, 1988.
39. Havel, T. F. "An evaluation of computational strategies for use in the determination of protein structure from distance constraints obtained by nuclear magnetic resonance" Prog. Biophys. Molec. Biol. 1991 56, 43-78.
40. Crippen, G. M. "Conformational analysis by energy embedding" J. Comp. Chem. 1982 3, 471-476.
41. Purisima, E. O., Scheraga, H. A. "An approach to the multiple-minima problem by relaxing dimensionality" Proc. Natl. Acad. Sci. USA 1986 83, 2782-2786.
42. Aszodi, A., Taylor, W. R. "Folding polypeptide alpha-carbon backbones by distance geometry methods" Biopolymers 1994 34, 489-505.
43. Skolnick, J., Kolinski, A. "Dynamic Monte Carlo simulations of a new lattice model of globular protein folding, structure and dynamics" J. Mol. Biol. 1991 221, 499-531.
44. Covell, D. G. "Lattice model simulations of polypeptide chain folding" J. Mol. Biol. 1994 235, 1032-1043.
45. Sali, A., Shakhnovich, E., Karplus, M. "Kinetics of protein folding. A lattice model study of the requirements for folding to the native state" J. Mol. Biol. 1994 235, 1614-1636.
46. O'Toole, E. M., Panagiotopoulos, A. Z. "Effect of sequence and intermolecular interactions on the number and nature of low-energy states for simple model proteins" J. Chem. Phys. 1993 98, 3185-3190.
47. Socci, N. D., Onuchic, J. N. "Folding kinetics of proteinlike heteropolymers" J. Chem. Phys. 1994 101, 1519-1528.
48. Leopold, P. E., Montal, M., Onuchic, J. N. "Protein folding funnels: kinetic pathways through compact computational space" Proc. Natl. Acad. Sci. USA 1992 89, 8721-8725.
49. Onuchic, J. N., Wolynes, P. G., Luthey-Schulten, Z., Socci, N. D. "Towards an outline of the topography of a realistic protein folding funnel" Proc. Natl. Acad. Sci. USA 1995 92, 3626-3630.
50. Bryngelson, J., Onuchic, J. N., Socci, N. D., Wolynes, P. G. "Funnels, pathways and the energy landscape of protein folding" Proteins: Struc. Func. Genet. 1995 21, 167-195.
Z.B. Maksić and W.J. Orville-Thomas (Editors)
Pauling's Legacy: Modern Modelling of the Chemical Bond
Theoretical and Computational Chemistry, Vol. 6 © 1999 Elsevier Science B.V. All rights reserved.
Possible sources of error in the computer simulation of protein structures and interactions

J.M. Garcia de la Vega (a), J.M.R. Parker (b), and S. Fraga (c)

(a) Departamento de Quimica Fisica Aplicada, Universidad Autonoma de Madrid, 28049 Madrid, Spain
(b) Alberta Peptide Institute and Department of Biochemistry, University of Alberta, Edmonton, AB, Canada T6G 2S2
(c) Department of Chemistry, University of Alberta, Edmonton, AB, Canada T6G 2G2
1. INTRODUCTION

The significance of the results to be obtained from the simulation of protein structures and interactions, abetted by the availability of software packages and of powerful workstations with superb graphic capabilities, is encouraging a staggering proliferation of published material. Unfortunately, the quality of the predictions may be affected by a number of possible sources of error so that, unless the information obtained from the simulation is compared with experimental data, the danger exists that the unaware user of a software package may be offering a discussion based on a biased and/or wrongly characterized structure. That is, 'it is important to be aware of the quality of the parameters in use' [1].

The purpose of this chapter is to highlight where those sources of error may be found in the simulation procedure, and to that end we must examine its components. In this connection some comments regarding the terminology are in order. A model is either 'a description or analogy used to help visualize something that cannot be directly observed' or 'a system of postulates, data, and inferences presented as a mathematical description of an entity', while to model is 'to produce a representation or simulation of' and simulation is 'the imitative representation of the functioning of one system or process by means of the functioning of another', such as in 'a computer simulation of' [2]. In that broad sense, modelling and simulation might be taken as being interchangeable, and in the literature these two designations are often used as synonyms. In order to avoid confusion, however, we would prefer to consider the adoption of a model as one of the components of a simulation.
That is, we will adopt the following scheme: a simulation is performed, within the framework of a theory, for a chosen model, the calculations being carried out by a given method, with those approximations that may be required for practical reasons and using the appropriate numerical techniques. Thus, if we restrict ourselves to those simulations that have a clear quantum-chemical component, in the simplest case one is basing the procedure on perturbation theory, with a model in which the protein is represented by a collection of point particles (i.e., the atoms), with some distance constraints in an attempt to account for the existence of bonds, using an energy minimization method with appropriate computational techniques, the final characterization of the resulting structure being done in terms of atomic coordinates or dihedral angles, etc.

The role of perturbation theory is simply to suggest the use of 1/R-expansions for the evaluation of the interaction energy between non-bonded atoms. The actual expressions of the expansions may be obtained purely from theoretical results but also with some semiempirical considerations. [We will omit in this discussion the existence of additional terms in the potential energy functions as well as those functions obtained in a simplified form from experimental data, although some mention will be made of the latter.]

What, then, are the deficiencies in such a procedure? Perturbation theory is not really applied as such, the model of point particles is rather poor, the 1/R-expansions contain ill-defined parameters, the energy minimization is affected (in order to reduce the computing costs) by the practical consideration of a distance cut-off threshold in the evaluation of the interaction energy, and the routine characterization of the structure in terms of its dihedral angles may be misleading. Most of these deficiencies are also present in the more sophisticated Monte Carlo and Molecular Dynamics simulations. The model cannot be improved, given the size of the proteins, and the need for a compromise between accuracy and computing time in the evaluation of the potential energy may be avoided with more computing power. Consequently we will centre our attention on the errors associated with the parameters in the potential energy function and the characterization of the structure.

2. DEFICIENCIES OF POTENTIAL ENERGY FUNCTIONS

A detailed account of the derivation of potential energy functions (PEF) has been published recently [3] and therefore only those points of interest for the present purpose will be considered here. The discussion will be centred on those theoretical and semiempirical PEF which use 1/R-expansions for the non-bonded interactions.
The Coulombic term of the electrostatic interaction between two non-bonded atoms, A and B, is given by $q_A q_B / R_{AB}$, in terms of their effective charges, $q_A$ and $q_B$, where $R_{AB}$ denotes their separation; in PEF obtained by the fitting of theoretical results, this term may be affected by an appropriate constant. The problems associated with the use of effective charges are many:

(a) The effective charges are not physical observables. They are artifacts of an analysis of the electron density obtained from the wave function for the molecule under consideration. As such they depend on the quality of the wave function (i.e., the method used for its determination and whatever approximations have been introduced) as well as on the population analysis adopted for their definition. The original population analysis, proposed by Mulliken [4], has been widely used, but the effective charges obtained with it do not reproduce the electrostatic potential. Over the years a great variety of schemes have been developed for the description of the electronic density distribution, in terms of charges, multipole moments, etc. [5-41] as well as for the determination of effective charges from electrostatic potentials [42-59].

(b) The size (i.e., the number of atoms) of a protein precludes at this moment the possibility of obtaining the corresponding wave function. Therefore, in order to be able to proceed with the simulation, recourse must be made to approximate values for the effective charges. One approximation consists of the use of average effective charges for the various classes of atoms [60], obtained from the results of calculations for the individual amino acids.
(c) As a rule, when using average charges, the total charge of the individual amino acids will not be reproduced and a renormalization of the charges will be needed. The resulting charges will depend on the renormalization procedure adopted [61,62].

(d) When a peptidic chain is being constructed from the individual amino acids, the formation of the peptidic bond involves the removal of an OH-group from one of the amino acids and of a H-atom from the other [3]. As a rule, the total charge of the OH-group and the charge of the H-atom will not cancel each other and the resulting system will be affected by a total non-vanishing charge. The electric neutrality of the system must then be restored by a new renormalization of the charges.

The construction of the peptidic chain may be performed in different ways, with a corresponding effect on the final charges. On the one hand, one could first complete the peptidic chain before any attempt at optimization is made, needing only one renormalization of the charges. One could, however, proceed with some partial optimization after each peptidic bond is formed, in which case as many renormalizations are required as there are peptidic bonds. It is also possible to construct the peptidic chain from preassembled fragments, which could have been built up in either of the two ways mentioned above. The result is that the final effective charges will be different depending on the path chosen. The values presented in Table 1 illustrate these differences. The differences will be, as a rule, small but might have an appreciable effect on the overall interaction energy. It is not possible to predict a priori what that effect will be because the changes in the effective charges may lead either to a cooperative effect or to an accidental cancellation in the summation $\sum\sum q_A q_B / R_{AB}$.

Table 1
Renormalized effective charges for the amino N in the natural amino acids (a)

Residue     (b)       (c)       (d)
ala        -0.557    -0.585    -0.570
arg, asn   -0.569, -0.585, -0.558, -0.561, -0.571, -0.547 (values in the order extracted; their assignment to the two residues and three columns is not recoverable)
asp        -0.555    -0.546    -0.533
cys        -0.567    -0.554    -0.541
glu        -0.557    -0.548    -0.534
gln        -0.559    -0.548    -0.542
gly        -0.589    -0.566    -0.594
his        -0.533    -0.512    -0.527
ile        -0.544    -0.528    -0.530
leu        -0.544    -0.529    -0.528
lys        -0.580    -0.565    -0.562
met        -0.570    -0.553    -0.548
phe        -0.570    -0.548    -0.571
pro        -0.536    -0.507    -0.527
ser        -0.589    -0.559    -0.571
thr        -0.564    -0.541    -0.549
trp        -0.570    -0.549    -0.554
tyr        -0.574    -0.553    -0.556
val        -0.541    -0.529    -0.531

(a) The starting average effective charge used was -0.554 [3]. The values in this table have been obtained with the software package maPSI (S. Fraga and S.E. Thornton, Department of Chemistry, University of Alberta, Edmonton, AB, Canada T6G 2G2), using one of the existing renormalization schemes [61].
(b) In the isolated amino acids, as given by Fraga et al. [3].
(c) In the polypeptide ala-arg-asn-asp-cys-glu-gln-gly-his-ile-leu-lys-met-phe-pro-ser-thr-trp-tyr-val, with renormalization after each peptidic bond is formed.
(d) In the same peptide as above, but constructed from three fragments, with 7, 6, and 7 residues, respectively. For each fragment, renormalization was performed after the formation of each peptidic bond.

Just as an example of the possible effect for one term, we may consider the interaction between the backbone N of the glycine and proline residues in the peptide in Table 1, constructed directly from the residues or from preassembled fragments (columns c and d). The differences (in absolute value) in the Coulombic energy, when using the effective charges in column (c) versus those in column (d) of Table 1, are 12.1, 9.1, 7.3, and 3.6 kJ/mol at separations of 3, 4, 5, and 10 Å.

It seems highly improbable that the final effective charges obtained in this fashion will be identical to the ones that would be derived, adopting the same population analysis, from the protein wave function determined at the same level of approximation as used for the individual amino acids. It is not possible to decide which set of effective charges is more correct, but one must conclude that the effective charges may be affected by an uncertainty and that the simulation may be biased. It must be pointed out, however, that if the fitting of the 1/R-expansion (from the theoretical results) were to be performed using renormalized average charges for the individual amino acids, then the only deficiency left would be the one discussed in (d).

The preceding list of possible deficiencies is not yet complete. The conformation of the protein under study evolves as the simulation proceeds: the bond angles and lengths within each residue may change and the relative separations and orientations of non-adjacent residues will change. In a proper quantum-chemical calculation such changes, which imply a change in the interactions, would result in a change in the electron density distribution in the protein, which would be reflected in a change in the effective charges (determined, of course, by the same population analysis adopted initially).
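The Gly/Pro backbone-N example quoted with Table 1 can be checked with a short calculation. The sketch below is an illustration, not the authors' code: it assumes charges in units of the elementary charge, separations in Å, and the conversion factor of about 1389.35 kJ mol⁻¹ Å for the Coulomb term; the charges are the columns (c) and (d) entries for gly and pro in Table 1.

```python
# e^2 / (4 pi eps0) expressed in kJ mol^-1 Å (assumed conversion factor).
COULOMB_CONSTANT = 1389.35

def coulomb_energy(q_a, q_b, r_ab):
    """Coulombic term q_A q_B / R_AB in kJ/mol (charges in e, distance in Å)."""
    return COULOMB_CONSTANT * q_a * q_b / r_ab

# Effective charges of the amino N from Table 1, columns (c) and (d).
GLY_N = {"c": -0.566, "d": -0.594}
PRO_N = {"c": -0.507, "d": -0.527}

def path_difference(r_ab):
    """Absolute difference in the Gly N / Pro N Coulombic energy between
    the two construction paths, at separation r_ab."""
    e_c = coulomb_energy(GLY_N["c"], PRO_N["c"], r_ab)
    e_d = coulomb_energy(GLY_N["d"], PRO_N["d"], r_ab)
    return abs(e_c - e_d)

for r in (3.0, 4.0, 5.0, 10.0):
    print(f"R = {r:4.1f} Å   |ΔE| = {path_difference(r):.2f} kJ/mol")
```

Under these assumptions the differences agree with the values quoted in the text to within about 0.1 kJ/mol, which shows how a change of only 0.02 to 0.03 in each charge translates into several kJ/mol for a single pair term.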
This deficiency, the use of fixed effective charges along the simulation path, (a) will introduce a new bias in the evolution of the conformation of the protein along the simulation path; (b) may be particularly harmful in docking procedures in which a molecular association is formed, with strong interactions between the two partners; and (c) will also be present when using simplified expansions with fixed coefficients obtained from experimental data for conformations different from those that appear along the simulation path. In order to remedy this situation it has been suggested [54,63-79] that the simulation should be complemented with a quantum-chemical component, but attention must be paid in such a case to the quality of the latter. That is, a quantum-chemical calculation at a low level of approximation will not necessarily correct the deficiency.

3. CONFORMATIONAL CHARACTERIZATION

Changes in the dihedral angles, $\phi$ and $\psi$, which represent about 1/10 of the total degrees of freedom in protein structures, are responsible for most of the large-scale movements in proteins [80]. Consequently, conformational searches are often performed through variation of those dihedral angles, with fixed geometry (i.e., fixed bond angles and distances), and the resulting structures identified by a listing of the final dihedral angles. Variation of either a single $\phi$ or $\psi$ angle of a peptide conformation (with fixed geometry) produces a large global conformational change. However, conformational search methods which vary two or more angles simultaneously allow for local movement without large changes in the global conformation [81]. Since the Cα(i)-C'(i) and N(i+1)-Cα(i+1) bonds are nearly parallel, a cooperative change of $\psi(i)$ and $\phi(i+1)$ may be a possible source of local movement without a corresponding large global distortion [82,83]. Using rigid geometry it is not immediately apparent that such a movement will not change the global conformation, since those two bonds are not collinear but displaced by approximately 1 Å.
If they were collinear, a change $\Delta\psi(i)$ together with a change $\Delta\phi(i+1) = -\Delta\psi(i)$ would maintain the global conformation. This change results in a rotation of the peptide plane by the same increment given to $\psi(i)$, and therefore this cooperative variation of $\psi(i)$ and $\phi(i+1)$ will be denoted as peptide plane rotation (PPR). (Such a procedure is similar to the one involving the use of a virtual or pseudo dihedral angle [84,85].)

This PPR procedure has been tested in a decapeptide with a starting regular conformation corresponding to a right-hand α-helix. Eight additional structures were generated in each case by changing the values of $\psi(5)$ and $\phi(6)$ by 40° increments, with the restriction that $\Delta\phi(6) = -\Delta\psi(5)$, maintaining fixed all the remaining dihedral angles $\phi$, $\psi$ at their values in the original conformation; that is, the structures are identified by the values -160°/60°, -120°/20°, -80°/-20°, 0°/-100°, 40°/-140°, 80°/-180° (= 80°/180°), 120°/-220° (= 120°/140°), and 160°/-260° (= 160°/100°) of the $\psi(i)$/$\phi(i+1)$ angles. Figure 1 illustrates the results obtained; the maximum rmsd is 1.08 Å and the maximum change in the Cα(1)-Cα(10) distance is 1.2 Å. A similar test has been carried out for a left-hand α-helix, Type I and II β-turns, and an extended structure, with the similar result that PPR allows for local conformational changes without large global conformational changes.

Figure 1. Conformations of nine right-hand α-helices. The starting conformation is presented at the top of the left column. The other structures are those with the values -80°/-20°, -120°/20°, -160°/60°, and 160°/100° (on the left column) and 120°/140°, 80°/180°, 40°/-140°, and 0°/-100° (on the right column) for the $\psi(5)$/$\phi(6)$ angles. See the text for details.
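The bookkeeping behind the PPR test can be sketched as follows. This is a minimal illustration, not the authors' procedure: the function names are hypothetical, and the starting values $\psi(5)$ = -40° and $\phi(6)$ = -60° are an assumption inferred from the angle pairs listed for Figure 1. It only generates the coupled angle pairs; building and comparing the structures is outside its scope.

```python
def wrap_angle(angle):
    """Map an angle in degrees onto the interval (-180, 180]."""
    angle = angle % 360
    return angle - 360 if angle > 180 else angle

def ppr_series(psi_i, phi_next, step=40, count=8):
    """Generate (psi(i), phi(i+1)) pairs for a peptide plane rotation.

    Each step changes psi(i) by -step and phi(i+1) by +step, so that
    delta_phi(i+1) = -delta_psi(i) and the sum psi + phi stays constant
    (modulo 360 degrees).
    """
    pairs = []
    for k in range(1, count + 1):
        delta_psi = -step * k
        pairs.append((wrap_angle(psi_i + delta_psi),
                      wrap_angle(phi_next - delta_psi)))
    return pairs

# Assumed starting alpha-helical values for psi(5) and phi(6).
pairs = ppr_series(-40, -60)
for psi, phi in pairs:
    print(f"psi(5) = {psi:5d}   phi(6) = {phi:5d}")
```

Because the sum of the two angles is conserved, a listing of the individual $\phi$, $\psi$ values alone makes these structures look unrelated, whereas the conserved sum immediately identifies them as members of one PPR family.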
Next, Pancreatic Trypsin Inhibitor (5PTI) was considered because of its characteristics (i.e., a well-defined X-ray structure, with a conformation restricted by three disulfide bonds). The peptidic bond between residues 27 and 28 was selected for the test because residue 28 forms part of a loop with irregular conformation and is not involved in H-bonding. As in the preceding cases, eight structures were again generated with changes of 40° for $\psi(27)$ and $\phi(28)$, with the restriction $\Delta\phi(28) = -\Delta\psi(27)$. The procedure was carried out with InsightII (Biosym), minimizing the structure using dihedral-angle restraints and a tether force for residues 1-23 and 32-51. This procedure maintains the original X-ray structure for those two regions and allows for flexible geometry optimization of the 24-31 region. Figure 2 shows, superimposed, the final structures, as well as the original one. The values obtained for the bond angles and distances are consistent with the ranges observed in high-resolution X-ray determinations of several PTI structures.

Figure 2. Superimposed conformations of nine structures of Pancreatic Trypsin Inhibitor (5PTI), including the original X-ray structure. The remaining eight structures have been generated as described in the text, with application of the PPR method.
The above conformations describe local movements with little global change. The point to be emphasized, however, is that a characterization of those conformations by a listing of their $\phi$, $\psi$ values for each residue would seem to suggest that very different conformations were being considered when, in fact, they are very similar when analyzed from the point of view of a PPR.

4. CONCLUSIONS

The cautionary tone in part of this chapter has been adopted on purpose, in order to emphasize our opinion that the quality of computer simulations using PEF with fixed coefficients cannot be ascertained except by comparison with experimental information. However, once that comparison has been made, the simulation results may be extremely useful. That is why a close collaboration between experimental and theoretical researchers is strongly recommended.

The problem regarding the characterization of a structure in terms of its dihedral angles is of a different nature. In fact, the interest of the evidence presented lies in the result that a situation may arise in which a structure is rejected, even though the global conformation is essentially correct, simply because PPR has not been taken into account.

The PPR also has implications in conjunction with the thermodynamic hypothesis [86,87], the Levinthal paradox [88], the use of redundant conformations in conformational searches, and the possibility of an enhanced search procedure (projected angle method). These points lie outside the scope of the present work and will be discussed in detail in future work [89].

REFERENCES
1. K. Gundertofte, T. Liljefors, P.-O. Norrby, and I. Pettersson, J. Comput. Chem., 17 (1996) 429.
2. Webster's New Collegiate Dictionary, Thomas Allen & Son Limited, Toronto, 1981.
3. S. Fraga, J.M.R. Parker, and J.M. Pocock, Computer Simulations of Protein Structures and Interactions, Springer, Berlin, 1995.
4. R.S. Mulliken, J. Chem. Phys., 23 (1955) 1833, 1841, 2338, 2343.
5. J. Hinze and H.H. Jaffe, J. Am. Chem. Soc., 84 (1962) 540.
6. E.R. Davidson, J. Chem. Phys., 46 (1967) 3320.
7. M. Pollak and R. Rein, J. Chem. Phys., 47 (1967) 2045.
8. I.H. Hillier and J.F. Wyatt, Int. J. Quantum Chem., 3 (1969) 67.
9. P. Politzer and R.R. Harris, J. Am. Chem. Soc., 92 (1970) 6451.
10. R.F.W. Bader, P.M. Beddall, and P.E. Cade, J. Am. Chem. Soc., 93 (1971) 3095.
11. R.E. Christoffersen and K.A. Baker, Chem. Phys. Lett., 8 (1971) 4.
12. P. Politzer and R.S. Mulliken, J. Chem. Phys., 55 (1971) 5135.
13. P. Politzer and P.H. Reggio, J. Am. Chem. Soc., 94 (1972) 8308.
14. J.R. Rabinowitz, T.J. Swissler, and R. Rein, Int. J. Quantum Chem., 6 (1972) 353.
15. G.A. Gallup and J.M. Norbeck, Chem. Phys. Lett., 21 (1973) 495.
16. K. Jug, Theor. Chim. Acta, 29 (1973) 9; 31 (1973) 63; 39 (1975) 301.
17. R. Rein, Adv. Quantum Chem., 7 (1973) 335.
18. D. Dovesi, C. Pisani, R. Ricca, and C. Roetti, J. Chem. Soc. Faraday Trans., 2 (1974) 1381.
19. K.R. Roby, Mol. Phys., 27 (1974) 81; 28 (1974) 1441.
20. A. Julg, Topics in Current Chemistry, vol. 58, Springer, Berlin, 1975.
21. T. Okada and T. Fueno, Bull. Chem. Soc. Japan, 49 (1976) 1524.
22. F.L. Hirshfeld, Theor. Chim. Acta, 44 (1977) 129.
23. F.A. Momany, J. Phys. Chem., 82 (1978) 592.
24. J. Gasteiger and M. Marsili, Tetrahedron, 36 (1980) 3219.
25. S. Huzinaga and S. Narita, Israel J. Chem., 19 (1980) 242.
26. S. Iwata, Chem. Phys. Lett., 69 (1980) 305.
27. J.O. Noell, Inorg. Chem., 21 (1980) 11.
28. M.D. Guillen and J. Gasteiger, Tetrahedron, 39 (1983) 1331.
29. J. Fernandez Rico, J.R. Alvarez Collado, and M. Paniagua, Mol. Phys., 56 (1985) 1145.
30. J. Fernandez Rico, R. Lopez, J.M. Garcia de la Vega, and J.I. Fernandez Alonso, J. Mol. Struct. (Theochem), 120 (1985) 163.
31. A.E. Reed, R.B. Weinstock, and F. Weinhold, J. Chem. Phys., 83 (1985) 735.
32. J. Fernandez Rico, R. Lopez, M. Paniagua, and J.I. Fernandez Alonso, Int. J. Quantum Chem., 29 (1986) 1155.
33. J. Fernandez Rico, J.R. Alvarez Collado, M. Paniagua, and R. Lopez, Int. J. Quantum Chem., 30 (1986) 671.
34. A.E. Reed, L.A. Curtiss, and F. Weinhold, Chem. Rev., 88 (1988) 899.
35. F. Weinhold and J.E. Carpenter, The Structure of Small Molecules and Ions, Plenum, New York, 1988.
36. J.M. Garcia de la Vega, R. Lopez, J.R. Alvarez Collado, J. Fernandez Rico, and J.I. Fernandez Alonso, in Molecules in Physics, Chemistry and Biology, vol. 3, edited by J. Maruani, Kluwer, Dordrecht, 1989.
37. R.L. Nalewajski, K.V. Genechten, and J. Gasteiger, J. Am. Chem. Soc., 197 (1989) 829.
38. F. Colonna, J.F. Angyan, and O. Tapia, Chem. Phys. Lett., 172 (1990) 55.
39. K.T. No, J.A. Grant, and H.A. Scheraga, J. Phys. Chem., 94 (1990) 4732.
40. K.T. No, J.A. Grant, M.S. Jhon, and H.A. Scheraga, J. Phys. Chem., 94 (1990) 4740.
41. R.J. Boyd and J.M. Ugalde, Analysis of Wave Functions in Terms of One- and Two-Electron Density Functions, in Computational Chemistry. Structure, Interactions and Reactivity, edited by S. Fraga, Elsevier Science Publishers, Amsterdam, 1992.
42. S. Kim, M.S. Jhon, and H.A. Scheraga, J. Phys. Chem., 92 (1980) 7216.
43. S.R. Cox and D. Williams, J. Comput. Chem., 2 (1981) 304.
44. L.E. Chirlian and M.M. Francl, J. Comput. Chem., 8 (1987) 894.
45. D.E. Williams and J.M. Yan, Adv. At. Mol. Phys., 23 (1988) 87.
46. U. Dinur and T.A. Hagler, J. Chem. Phys., 91 (1989) 2949.
47. B.H. Besler, K.M. Merz, and P.A. Kollman, J. Comput. Chem., 11 (1990) 431.
48. C.M. Breneman and K.B. Wiberg, J. Comput. Chem., 11 (1990) 261.
49. G.G. Ferenczy, C.A. Reynolds, and W.G. Richard, J. Comput. Chem., 11 (1990) 159.
50. F.J. Luque, F. Illas, and M. Orozco, J. Comput. Chem., 11 (1990) 416.
51. S.S. Wee, S. Kim, M.S. Jhon, and H.A. Scheraga, J. Phys. Chem., 94 (1990) 1655.
52. R.J. Woods, M. Khalil, W. Pell, S.H. Moffat, and W.H. Smith, Jr., J. Comput. Chem., 11 (1990) 297.
53. C. Chipot, B. Maigret, J.L. Rivail, and H.A. Scheraga, J. Phys. Chem., 96 (1992) 10276.
54. K.M. Merz Jr., J. Comput. Chem., 13 (1992) 749.
55. C. Chipot, J. Angyan, G. Ferenczy, and H.A. Scheraga, J. Phys. Chem., 97 (1993) 6628.
56. C. Chipot, J. Angyan, B. Maigret, and H.A. Scheraga, J. Phys. Chem., 97 (1993) 9788.
57. C. Chipot, J. Angyan, B. Maigret, and H.A. Scheraga, J. Phys. Chem., 97 (1983) 9797.
58. J. Cieplak, W.D. Cornell, C. Bayly, and P.A. Kollman, J. Comput. Chem., 16 (1995) 1357.
59. S. Tsuzuki, T. Uchimaru, K. Tanabe, and A. Yliniemela, J. Mol. Struct. (Theochem), 365 (1996) 81.
60. E. Clementi, Computational Aspects for Large Chemical Systems, Springer, Berlin, 1980.
61. S. Fraga, J. Comput. Chem., 3 (1982) 329.
62. E.A. Bidacovich, S.G. Kalko, and R.E. Cachau, J. Mol. Struct. (Theochem), 210 (1990) 455.
63. A. Warshel and M. Levitt, J. Mol. Biol., 103 (1976) 227.
64. A. Warshel and R.M. Weiss, J. Am. Chem. Soc., 102 (1980) 6218.
65. U.C. Singh and P.A. Kollman, J. Comput. Chem., 7 (1986) 718.
66. M.J. Field, P.A. Bash, and M. Karplus, J. Comput. Chem., 11 (1990) 700.
67. P.A. Bash, M.J. Field, R.C. Davenport, G.A. Petsko, D. Ringe, and M. Karplus, Biochemistry, 30 (1991) 5826.
68. J. Gao, J. Phys. Chem., 96 (1992) 6432.
69. V.V. Vasilyev, A.A. Bliznyuk, and A.A. Voytiuk, Int. J. Quantum Chem., 44 (1992) 897.
70. J. Aqvist and A. Warshel, Chem. Rev., 93 (1993) 2523.
71. P.D. Walker and P.G. Mezey, J. Am. Chem. Soc., 115 (1993) 12423.
72. U. Sternberg, F.-T. Koch, and M. Mollhoff, J. Comput. Chem., 15 (1994) 524.
73. V. Thery, D. Rinaldi, J.-L. Rivail, B. Maigret, and G.
74. G. Ferenczy, J. Comput. Chem., 15 (1994) 269.
75. V.V. Vasilyev, J. Mol. Struct. (Theochem), 304 (1994) 129.
76. U. Koch and E. Egert, J. Comput. Chem., 16 (1995) 937.
77. R.V. Stanton, D.S. Hartsough, and K.M. Merz Jr., J. Comput. Chem., 16 (1995) 113.
78. D. Bakowies and W. Thiel, J. Comput. Chem., 17 (1996) 87.
79. M. Freindorf and J. Gao, J. Comput. Chem., 17 (1996) 386.
80. L.M. Rice and A.T. Brunger, Proteins: Struct. Funct. and Genet., 19 (1994) 277.
81. A. Elofson, S.M. LeGrand, and D. Eisenberg, Proteins: Struct. Funct. and Genet., 23 (1995) 73.
82. W.L. Peticolas and B. Kurtz, Biopolymers, 19 (1980) 1153.
83. J.A. McCammon and S.H. Northrup, Biopolymers, 19 (1980) 2033.
84. M. Levitt, J. Mol. Biol., 104 (1976) 59.
85. R.S. DeWitte and E.I. Shakhnovich, Protein Sci., 3 (1994) 1570.
86. C.B. Anfinsen, Science, 181 (1973) 223.
87. K.A. Dill, Biochem., 24 (1985) 1501.
88. C. Levinthal, in Mossbauer Spectroscopy in Biological Systems, edited by P. Debrunner, J.C.M. Tsibris, and E. Munck. Proceedings of a meeting held at Allerton House, Monticello, IL (1969).
89. J.M.R. Parker, to be published.
Z.B. Maksić and W.J. Orville-Thomas (Editors) Pauling's Legacy: Modern Modelling of the Chemical Bond
Theoretical and Computational Chemistry, Vol. 6 © 1999 Elsevier Science B.V. All rights reserved.
The nature of Van der Waals bond

Grzegorz Chałasiński (a), Małgorzata M. Szczęśniak (b), and Sławomir M. Cybulski (c)

(a) Department of Chemistry, University of Warsaw, Pasteura 1, 02-093 Warszawa, Poland
(b) Department of Chemistry, Oakland University, Rochester, Michigan 48309, United States of America
(c) Department of Chemistry and Biochemistry, Miami University, Oxford, Ohio 45056, United States of America
1. INTRODUCTION

Linus Pauling has had a lasting impact on two areas of the theory of molecular interactions [1,2]. He was the first to recognize the importance of weak interactions, such as hydrogen bonding, to biology, and he pioneered the theoretical investigation of these interactions. His second powerful idea was the role of molecular shape in such diverse areas as crystal packing and the complementarity of enzyme-substrate interactions.

By his own admission, Pauling never used a computer in his studies [3]. In a recent reminiscence he said: "I am sure that if I had been relying on a computer to make most of the calculations, some of these ideas, which have in fact turned out to be important, would not have occurred to me." By contrast, today's applications of quantum chemistry to these systems are almost exclusively carried out on a computer, taking advantage of unprecedented advances in solving the Schrödinger equation. The computational methods have become reliable enough to provide us with potential energy surfaces that are suitable for simulations of the behavior of molecular clusters. The wealth of numerical data generated in this way enhances the need for simple and intuitive rationalizations, which were the hallmark of Pauling's work.

In this contribution we intend to show that such a program can be most appropriately accomplished by dissecting the interaction between molecules into a few basic components. These components, the electrostatic, induction, dispersion and exchange energies, are as important conceptually to the theory of weak interactions as the notions of ionic, covalent, and metallic bonding were to Pauling's approach to chemical bonding. We will demonstrate that any sensible modeling of intermolecular forces must rely on these four basic components.

The primary concept of the theory of intermolecular interactions is the intermolecular potential energy surface (PES) [4].
The intermolecular force is defined as the negative gradient of this energy. In the quantum mechanical setting, the PES has its origin in the Born-Oppenheimer approximation as a potential energy for the motion of nuclei. In the case of molecular interactions, it is usually reasonable to treat the interacting species as rigid bodies in their equilibrium geometries. Such an approach provides a PES that depends only on the intermolecular degrees of freedom. If the intermolecular forces are strong enough to cause a sizable deformation of the monomers, the intramonomer degrees of freedom should also be included. The crudest approximation is to separate these two effects into the interaction of deformed monomers and the effect of monomer geometry relaxation. A rigorous approach should solve the equations for the nuclear motions on the PES, which parametrically depends on the inter- and (at least some) intramonomer degrees of freedom. At either level of complexity the ultimate task of theory is to interpret and predict experimental measurements, such as the dissociation energies into free monomers, intra- and intermolecular vibrational frequencies in Van der Waals complexes, and integral and differential cross sections for scattering experiments. In all such measurements the effect of the interaction is gauged relative to relaxed monomers.

2. FUNDAMENTAL INTERACTION ENERGY COMPONENTS

As the understanding of chemical bonding was advanced through such concepts as covalent and ionic bonds, lone electron pairs, etc., the theory of intermolecular forces also attempted to break down the interaction energy into a few simple and physically sensible concepts. To describe nonrelativistic intermolecular interactions it is sufficient to express them in terms of the aforementioned four fundamental components: electrostatic, induction, dispersion and exchange energies.

Typical closed-shell molecules feature an uneven distribution of charge density. This charge distribution may be described by permanent multipole moments or, in the simplest approximation, by point charges distributed over the molecules. When two molecules are far apart the moments interact via the Coulomb law, giving rise to the long-range part of the PES (the part that decays as some inverse power of the intermolecular distance, $R^{-n}$). This interaction is essentially exhibited in three different forms, primarily derived by London [5] (cf. also Refs. 6 and 7).
First, there is the direct electrostatic interaction between multipole moments. Second, since the interaction perturbs the species involved, multipoles are created and/or modified. These modifications give rise to what is usually referred to as the induction interaction energy. There is also a third type of interaction, called the dispersion interaction, that molds the long-range shape of the potential. Whereas the first two interactions may be viewed as clearly related to classical electrodynamics, the dispersion energy emerges because of the quantum mechanical nature of the molecular world. Its semiclassical model, consisting of the interaction of "instantaneous multipoles" due to the fluctuating positions of electrons, may serve only as a simplified visualization. In fact, the dispersion energy is an electron correlation effect that takes place in the area between the interacting monomers.

Bringing molecules closer to the Van der Waals minimum, roughly defined by the Van der Waals radii, generally leads to two additional effects. One is an alteration of all three interactions discussed earlier. Indeed, electrostatic, induction and dispersion effects can no longer be modeled by a multipole expansion. This is because the electron clouds begin to overlap, giving rise to some exponential damping of the long-range interactions. This overlap brings about a secondary modification, which may easily be built into the classical models. A primary effect on the PES, however, is due to the appearance of a new, purely quantum-mechanical factor. It is known as the exchange repulsion or Heitler-London (HL) exchange energy. It is due to such quantum mechanical considerations as the delocalization of the electrons and their indistinguishability. It corresponds to "resonance" integrals in Pauling's description of chemical bonds and is related to the Pauli exclusion principle [8-11].

Among closed-shell molecules the fully occupied monomer orbitals repel rather than invite the electrons of the partner. This effect of blocking the electronic space has been related to such notions as atomic radii and molecular shapes, concepts to whose development Pauling contributed significantly. It means that in the first approximation the atoms and molecules may be viewed as surrounded by rigid spheres or contours, which form impenetrable borders for partner monomers. In a better approximation, the shapes of monomers are soft, featuring an exponential rise and decay with R.

The simplest picture of the intermolecular interaction in the complex is, thus, the following: the monomers stick together because they are glued by electrostatic forces in the form of the direct electrostatic energy, the induction energy and the dispersion energy. They cannot collapse onto each other because of the exchange effect (the Pauli principle), which prevents the occupied orbitals from overlapping. This picture may be simulated in a variety of ways, from a very simple one (a direct electrostatic model restrained by hard spheres that envelop the molecules [12,13]), through combined ab initio, model and semiempirical approaches [14-16], to the most sophisticated ones based on rigorous solutions of the Schrödinger equation, in particular by means of the symmetry-adapted perturbation theory (SAPT) [17-20]. SAPT, while preserving the backbone of the classical perturbation theory approach of London [5], supplements it with refinements and couplings of the electrostatic, induction, dispersion and exchange terms. This results in a beautiful theory, which blends mathematical rigor with the conceptual simplicity of common intuitions. Before proceeding any further, we will briefly outline the ab initio theory of intermolecular interactions.
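The simplest picture described above can be illustrated with a toy one-dimensional potential. The following is only a minimal sketch, not a model taken from this chapter: it assumes a soft exponential exchange repulsion, A exp(-bR), and keeps just the leading dispersion attraction, -C6/R^6, as the "glue". The parameter values A, B and C6 (atomic units) and the function names are hypothetical and chosen purely for illustration.

```python
import math

# Hypothetical parameters in atomic units (illustration only, not fitted to any system)
A = 100.0    # exchange-repulsion prefactor, E_h
B = 1.8      # exchange-repulsion range parameter, 1/a0
C6 = 64.3    # leading dispersion coefficient, E_h * a0**6


def potential(r):
    """Model interaction energy: soft exponential repulsion plus -C6/R^6 attraction."""
    return A * math.exp(-B * r) - C6 / r**6


def force(r):
    """Intermolecular force, i.e. the negative derivative -dV/dR of the model potential."""
    return A * B * math.exp(-B * r) - 6.0 * C6 / r**7


def find_minimum(r_start=4.0, r_stop=15.0, step=0.001):
    """Locate the Van der Waals minimum by a simple grid scan over R."""
    n = int(round((r_stop - r_start) / step))
    grid = [r_start + i * step for i in range(n + 1)]
    r_min = min(grid, key=potential)
    return r_min, potential(r_min)


if __name__ == "__main__":
    r_min, e_min = find_minimum()
    print(f"R_min = {r_min:.3f} a0, E_min = {e_min * 1e6:.1f} microhartree")
    print(f"force at R_min = {force(r_min):.2e} E_h/a0")
```

The scan starts at R = 4 a0 on purpose: an undamped -C6/R^6 term diverges at very short range, which is one reason why realistic models damp the long-range terms in the overlap region.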
3. AB INITIO APPROACH TO INTERMOLECULAR FORCES

All the information that is needed to describe a particular intermolecular interaction in mathematical and physical terms is included in the Schrödinger equation for the system under consideration

$H\Psi = E\Psi$    (1)

where $H$ is the total Hamiltonian of the Van der Waals complex, and $\Psi$ and $E$ are its wave function and energy, respectively. Unfortunately, except for a few systems, solving the Schrödinger equation is a very difficult task demanding ingenious approximations and state-of-the-art computational techniques. The first important approximation, known as the Born-Oppenheimer approximation, separates the motions of nuclei and electrons. Light electrons are assumed to instantly follow any infinitesimal change of the nuclear positions. This approximation brings about the crucial concept of the potential energy surface (PES) [4,15]. In general, one may visualize nuclei moving due to forces exerted on them by this potential field. In the case of intermolecular interactions, as long as monomers are kept rigid, we can view them as "moving" with respect to each other upon the intermolecular PES. In other words, the PES provides a playground for the interactions of molecules. Knowledge of a PES not only means understanding how two molecules interact, but also provides us with a means of simulating and predicting the properties of dimers, trimers and large aggregates of atoms and molecules.
The calculation of PESs is most easily accomplished by evaluating the interaction energy, $E_{int}$, which is defined as the difference between the energy of the dimer, $E_{AB}$, and the energies of the monomers, $E_A$ and $E_B$

$E_{int} = E_{AB} - E_A - E_B$    (2)
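As a purely numerical illustration of Eq. (2), the sketch below evaluates the interaction energy as a difference of total energies and shows how small it is relative to the quantities being subtracted. The total energies are made-up numbers (in hartree) that do not correspond to any real calculation, and the helper names are hypothetical.

```python
HARTREE_TO_MICROHARTREE = 1.0e6


def interaction_energy(e_ab, e_a, e_b):
    """Supermolecular interaction energy, Eq. (2): E_int = E_AB - E_A - E_B."""
    return e_ab - e_a - e_b


def worst_case_error(abs_err_ab, abs_err_a, abs_err_b):
    """Upper bound on the error of E_int when the three errors do not cancel at all."""
    return abs_err_ab + abs_err_a + abs_err_b


if __name__ == "__main__":
    # Hypothetical total energies in hartree (made-up numbers, illustration only)
    e_a, e_b = -526.8000, -187.6000
    e_ab = -714.4005
    e_int = interaction_energy(e_ab, e_a, e_b)
    print(f"E_int = {e_int * HARTREE_TO_MICROHARTREE:.1f} microhartree")
    print(f"|E_int / E_AB| = {abs(e_int / e_ab):.1e}")
    # Uncancelled errors of 1e-4 hartree per energy would be comparable to E_int itself
    print(f"worst-case error = {worst_case_error(1e-4, 1e-4, 1e-4):.1e} hartree")
```

Because the result is a small difference of large numbers, the subtraction is only meaningful when the errors of the three energies cancel systematically.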
The procedure chosen to calculate $E_{int}$ must ensure that the electronic energies of the dimer and monomers are evaluated in a consistent manner [19,21-24]. It should be stressed that this requirement is absolutely crucial, as no method at present can in practice yield the $E_{AB}$, $E_A$ and $E_B$ energies with an absolute error smaller than $E_{int}$. Therefore, Eq. (2), which defines the interaction energy, does not offer as simple a computational approach as might be expected at first glance. Two notorious inconsistencies to be alleviated in practice are the basis set inconsistency (the same basis set expansion or numerical grid must be used for A, B, and AB, otherwise the basis set superposition error (BSSE) arises [21,23,24]) and the size inconsistency (a theory describing AB must guarantee a correct dissociation into A and B at the same level of theory [25]).

Even if the algorithm is correct, it has the disadvantage of giving no direct insight into the nature of the interaction. Therefore, from the very beginning of quantum mechanics, scientists were inclined to solve the Schrödinger equation through a perturbation procedure that would provide the interaction energy directly and would offer insights into its physical nature and functional form. Originally, the fundamental components related to the Coulomb interaction of permanent, induced, or instantaneous multipole moments were recovered with the aid of the Rayleigh-Schrödinger (RS) perturbation theory [5-7]. The introduction of exchange effects occurred independently, and originated from the variational theory of the chemical bond formulated by Heitler and London. The development of a unified, rigorous treatment proved difficult because of the symmetry problem related to the indistinguishability of electrons. The classic RS formalism assigns electrons to either of the two monomers, and this approximation (the "polarization approximation") works as long as the monomers stay far apart [26].
In the intermediate and short range, though, the antisymmetry of the total dimer wavefunction cannot be efficiently recovered by means of the RS perturbation theory, which becomes divergent. Attempts to deal with the increase-of-symmetry problem were eventually successful and led to a variety of SAPT formalisms [11,18,27]. Along with the rapid development of many-body techniques to cope with electron correlation, SAPT today provides us with a rigorous framework, as well as a detailed description, for the quantum theory of intermolecular forces.

The details of SAPT are beyond the scope of the present work. For our purposes it is enough to say that the fundamental components of the interaction energy are ordinarily expanded in terms of two perturbations: the intermonomer interaction operator and the intramonomer electron correlation operator. Such a treatment provides us with the fundamental components in the form of a double perturbation series, which should be judiciously truncated at some low order that offers a compromise between efficiency and accuracy. The most important corrections for two- and three-body terms in the interaction energy are described in Table 1. The SAPT corrections are directly related to the interaction energy evaluated by the supermolecular approach, Eq. (2), provided that many-body perturbation theory (MBPT) is used [19,28]. The assignment of different perturbation and supermolecular energies is shown in Table 1.

Table 1
Decomposition of two- and three-body supermolecular (S-MPPT) interaction energies. The content of the S-MPPT terms is described and the leading SAPT terms are indicated in square brackets.

S-MPPT | SAPT | Physical interpretation
Two-body | |
$\Delta E^{SCF}$ | $\varepsilon_{es}^{(10)}$ | Electrostatic energy between SCF monomers
 | $\varepsilon_{exch}^{HL}$ | Exchange repulsion between SCF monomers
 | $\Delta E_{def}^{SCF}$ | Mutual polarization restrained by exchange [$\varepsilon_{ind,r}^{(20)}$]
$\Delta E^{(2)}$ | $\varepsilon_{disp}^{(20)}$ | Dispersion energy arising between SCF monomers (2nd order)
 | $\varepsilon_{es,r}^{(12)}$ | Electrostatic-correlation energy (2nd order). Intra-monomer correlation correction to $\varepsilon_{es}^{(10)}$
 | $\Delta E_{def}^{(2)}$ | 1. Deformation-intra-correlation [$\varepsilon_{ind,r}^{(22)}$]; 2. Deformation-dispersion [$\varepsilon_{disp-ind}^{(30)}$]
 | $\Delta E_{exch}^{(2)}$ | 1. Exchange-dispersion [$\varepsilon_{exch-disp}^{(20)}$]; 2. Exchange-intra-correlation
$\Delta E^{(3)}$, $\Delta E^{(4)}$, etc. | | Higher-order correlation corrections as for $\Delta E^{(2)}$
Three-body | |
$\Delta E^{SCF}$ | $\varepsilon_{exch}^{HL}$ | 1. SE component: single exchanges between monomers; 2. TE component: all monomers are involved in the exchange
 | $\Delta E_{def}^{SCF}$ | SCF-deformation nonadditivity [$\varepsilon_{ind,r}^{(20)}$, $\varepsilon_{ind,r}^{(30)}$]
$\Delta E^{(2)}$ | | 1. Exchange-dispersion nonadditivity [$\varepsilon_{exch-disp}^{(20)}$]; 2. Exchange-intra-correlation nonadditivity; 3. Deformation-intra-correlation nonadditivity [$\varepsilon_{ind,r}^{(22)}$]; 4. Deformation-dispersion nonadditivity [$\varepsilon_{disp-ind}^{(30)}$]
$\Delta E^{(3)}$ | $\varepsilon_{disp}^{(30)}$ | 1. Dispersion nonadditivity, accounts for triple-multipole terms; 2. Higher-order correlation corrections as for $\Delta E^{(2)}$
$\Delta E^{(4)}$, etc. | | Higher-order correlation corrections as for $\Delta E^{(3)}$

The power of this approach is its open-ended character. One can thoroughly analyse the role of individual corrections and evaluate them with carefully controlled effort and desired precision. The price is the involvement of many arbitrary choices and approximations, which at some point may render the method cumbersome and subjective. The proliferation of corrections with increasing order of the perturbation treatment, and the additional approximations needed to circumvent the divergent character of the theory, may pose problems. Another shortcoming is the limitation to intermediate and long-range interactions. Fortunately, all these drawbacks may be overcome by using the relationship to the supermolecular energies [19,28]. The supermolecular interaction energies defined in Eq. (2) have the advantage of being free from such problems and represent the most reliable interaction energy values for adjusting the final potential. Only a concerted use of the direct supermolecular approach and SAPT may lead to very accurate interaction energies. It also leads to a deeper understanding of the origin and behavior of the components of the interaction energy. Below, we will describe some very recent advances.
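Read row by row, the two-body part of Table 1 amounts to the bookkeeping written out below. This is only a restatement of the table in equation form, under the assumption that each supermolecular term is the sum of the entries listed beneath it; the symbol $\varepsilon_{es,r}^{(12)}$ for the electrostatic-correlation term is likewise an assumed notation.

```latex
\begin{align}
\Delta E^{SCF} &= \varepsilon_{es}^{(10)} + \varepsilon_{exch}^{HL} + \Delta E_{def}^{SCF}, \\
\Delta E^{(2)} &= \varepsilon_{disp}^{(20)} + \varepsilon_{es,r}^{(12)} + \Delta E_{def}^{(2)} + \Delta E_{exch}^{(2)} .
\end{align}
```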
3.1
Exchange repulsion versus molecular shape

"The Dutch physicist J.D. van der Waals found that in order to explain some of the properties of gases it was necessary to assume that molecules have a well defined size, so that two molecules undergo strong repulsion when, as they approach, they reach certain distance from one another. [...] It has been found that the effective sizes of molecules packed together in liquids and crystals can be described by assigning Van der Waals radii to each atom in the molecule. The Van der Waals radius defines the region that includes the major part of the electron distribution function for unshared [electron] pairs." Cf. Fig. 1.A [2].

The strong repulsion described by Pauling in his "General Chemistry" textbook [2] originates from the electron exchange effect that arises between the electron clouds of interacting monomers. The quantum mechanical approach allows us to link the concept of the effective size of molecules, or a molecular shape, to the exchange effect. We may visualise this shape in terms of the exchange repulsion that is sensed by a rare gas (RG) atom moving around a molecule. Several researchers have attempted to define alternative pictorial representations [29,30]. In Figs. 1.B and 2 we show the representation proposed by our group [31,32]. In the displayed diagrams, a contour is drawn in the polar coordinates that were used to define the motion of a probing RG atom around the molecule. However, the distance R is replaced by the value of the exchange-repulsion energy. In that manner, the contour exhibits local decreases and increases of repulsion, which indicate local concentrations and depletions of electron density in the diffuse region. Several examples of such plots for different molecules are shown in Figs. 1.B and 2. Let us examine the contour for the Cl2 molecule in Fig. 1.B. How does this compare with the drawing in Pauling's book? Both pictures reveal depletions in the middle of the Cl-Cl bond.
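The contour construction described above, in which the radial coordinate is the exchange-repulsion energy rather than the distance R, can be sketched in a few lines. This is a minimal illustration and not the procedure of Refs. 31 and 32: the angular dependence `model_exchange_energy` is a hypothetical Legendre-type expansion standing in for ab initio exchange energies computed at fixed R, and all names are assumptions.

```python
import math


def model_exchange_energy(theta):
    """Hypothetical exchange-repulsion energy (arbitrary units) felt by a probe atom
    at fixed distance R, as a function of the polar angle theta (radians).
    A short Legendre-type expansion mimics a homonuclear diatomic."""
    c = math.cos(theta)
    p2 = 0.5 * (3.0 * c**2 - 1.0)
    p4 = 0.125 * (35.0 * c**4 - 30.0 * c**2 + 3.0)
    return 1.0 + 0.3 * p2 - 0.2 * p4


def exchange_contour(energy, n_points=361):
    """Return (x, y) points of the polar contour (Energy, theta):
    the radial coordinate is the exchange energy, not the distance R."""
    points = []
    for i in range(n_points):
        theta = 2.0 * math.pi * i / (n_points - 1)
        e = energy(theta)
        points.append((e * math.cos(theta), e * math.sin(theta)))
    return points


if __name__ == "__main__":
    contour = exchange_contour(model_exchange_energy)
    radii = [math.hypot(x, y) for x, y in contour]
    print(f"largest repulsion:  {max(radii):.3f}")
    print(f"smallest repulsion: {min(radii):.3f}")
```

Plotting the returned points gives a closed curve whose concave parts mark directions of reduced repulsion; with real exchange energies these would be the niches discussed in the text.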
However, the Van der Waals radii approximation fails to predict a characteristic flattening of the electron distribution at the chlorine ends along the internuclear axis. One can see in this and all other drawings that the assumption of a spherical distribution around an atom within a molecule is not justified. The deviation from a spherical distribution may be small, but it is sufficient to influence the shape of the PES and to determine the equilibrium structures of Van der Waals complexes with non-polar species, for which induction effects are negligible. The PES of RG-Cl2 exhibits two kinds of minima, one for a T-shaped and one for a collinear form. The PES for ClF has three minima, and two of them (at the Cl side and the T-shaped one) are due to a reduction of the repulsive effect, cf. Fig. 2. The global minimum corresponds to the collinear form RG-ClF. The PESs of the RG-HF and RG-HCl systems feature two collinear minima, and it is clear from Fig. 2 that the one at the halogen atom end must be due to a reduction in repulsion. The RG-CO complexes are skew T-shaped, again in agreement with the depletion visible in Fig. 2. In general, any non-polar species (such as methane, a methyl group, N2, etc.) will preferably be hosted in niches within the exchange repulsion, which are displayed by these contours.

Figure 1. A) A chlorine molecule, illustrating the difference between Van der Waals radius and covalent radius (from Pauling's book [2]). B) The exchange repulsion contour of Cl2, obtained for the He-Cl2 complex, and defined by two polar coordinates measured from the center of mass: (Energy, Θ). The contour is the image of the chlorine molecule shape, detected by a rare-gas atom [32]. C) Relief map of the negative Laplacian of the charge density, $-\nabla^2\rho$, of the chlorine molecule. One can notice three depletions of the electron charge density: in the region perpendicular to the bond, and along the interatomic axis, for the collinear approach, at both ends.

Figure 2. The exchange repulsion contours for several molecules, obtained for interactions with rare-gas atoms, and defined by two polar coordinates measured from the center of mass: (Energy, Θ) [31,32]. The contours are the images of the molecules' shapes, probed by structureless atoms. In contrast to plots that show isoenergetic regions, these contours reveal an enhanced anisotropy. Convex and concave regions indicate, respectively, the areas of increased and reduced exchange repulsion.

Finally, it is important to add that the regions of reduced (or enhanced) repulsion may be directly related to the regions of depletion (or concentration) observed in the diffuse part of the plots of the Laplacian of the electron density proposed by Bader [33]. This can be seen in Fig. 1.C, where the negative of $\nabla^2\rho$ for Cl2 has been drawn. In this way, scrutiny of $-\nabla^2\rho$ indicates the locations where we may expect reduced repulsion, and which are thus the most favorable sites for a nucleophilic partner to attach. It is important to stress that the analysis of the Laplacian must carefully distinguish between the relative charge concentration at short range and the relative charge concentration in the diffuse region, as they sometimes do not coincide, and only the latter is relevant.

In quantitative modeling of PESs the description of the molecular shape as a superposition of atomic components remains an attractive approach, but it is clear from the earlier discussion that it must be extended to accommodate two important factors. The atomic shape is not a rigid, but rather a soft, exponentially decaying electronic charge cloud. In addition, it should be anisotropic, with the anisotropy depending not only on the atom itself, but also on its partner in the chemical bond.

3.2
Dispersion as the intermonomer correlation effect

The traditional view of the dispersion effect introduces the notion of instantaneous multipoles. Such multipoles arise when, for an infinitesimal portion of time, the electrons are frozen in their positions and thus become oriented with respect to the positively charged nuclei. A momentary multipole at, say, monomer A is created, which induces a momentary multipole at monomer B. The resulting electrostatic interaction defines the dispersion effect. In this way fluctuations of the negative electron charges of A become correlated with the electron charges of B. It is clear that this effect may be related to the intermonomer electron correlation phenomenon. This simple picture, however, is reasonable only as long as we assume that the electrons of A and B do not mingle with one another or, more precisely, do not penetrate each other's occupied space. Mathematically, it means that the condition $R > r_a + r_b$ [...]
Figure 3. Plots of the dispersion function for the equilibrium He dimer from Ref. 36. The electron coordinates, $x_1$ and $x_2$, are defined in panel (a). The accurate dispersion function (calculated with Gaussian geminals) is shown in plot (b). If the dispersion term is reduced to the dipole-dipole component, one obtains plot (c). If every electron is allowed to use the complete dimer basis set as well as the bond functions, one obtains plot (d).
[...] correlation will now have to account for a sizeable contribution from the "cusp": a singularity predicted by the Coulomb law when two electrons occupy the same point of the configuration space. Correlated but non-overlapping multipoles are not capable of describing this effect properly. In Fig. 3, which depicts two interacting He atoms, we show the dispersion ansatz to the electronic wavefunction, both exact and reduced to the dipole-dipole interaction [36]. One can see that whereas the dipole-dipole interaction reproduces the general pattern of this ansatz (its "butterfly shape"), it fails to reproduce a steep cliff, which arises in the region where the electron coordinates coincide. The related portion of the dispersion energy must be evaluated by some method that does not refer to the multipole approximation. In addition, a rigorous approach should use explicitly correlated basis functions centered not only at the nuclei, but also off them, in particular along the Van der Waals "bond" line. In many more approximate calculations, however, it is enough to supplement a reasonable basis set (that is, one that properly describes the first few moments and polarizabilities of the monomers) with some small set of functions located in the mid-bond region [19,36-38]. They are termed bond functions. The bond functions efficiently account for the cusp in the region where it is the most important for inter-monomer correlation. The effect of bond functions is also shown in Fig. 3. One immediately notices a considerable improvement in the cusp region.
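The limits of the multipole picture can be made concrete with a purely classical toy calculation. The sketch below is an illustration under stated assumptions rather than a dispersion calculation: it compares the exact Coulomb energy of two collinear point-charge dipoles with the leading dipole-dipole term, $-2\mu^2/R^3$ in atomic units. The charge q, the dipole length d and the function names are hypothetical.

```python
def exact_coulomb(q, d, r):
    """Exact Coulomb energy (atomic units) of two collinear point-charge dipoles.
    Each dipole is a pair -q, +q separated by d; the centres are a distance r apart."""
    charges_a = [(-q, -0.5 * d), (+q, +0.5 * d)]
    charges_b = [(-q, r - 0.5 * d), (+q, r + 0.5 * d)]
    return sum(qa * qb / abs(xb - xa) for qa, xa in charges_a for qb, xb in charges_b)


def dipole_dipole(q, d, r):
    """Leading multipole term for collinear, head-to-tail dipoles mu = q*d: -2 mu^2 / r^3."""
    mu = q * d
    return -2.0 * mu**2 / r**3


if __name__ == "__main__":
    q, d = 1.0, 1.0  # hypothetical charge and dipole length
    # Each charge sits d/2 from its own centre, so the charge clouds stay apart only for r > d
    for r in (1.2, 2.0, 5.0, 10.0, 50.0):
        ratio = exact_coulomb(q, d, r) / dipole_dipole(q, d, r)
        print(f"R = {r:5.1f}  exact / dipole-dipole = {ratio:.4f}")
```

For this arrangement the ratio equals 1/(1 - (d/R)^2): it tends to one at large separation and grows without bound as R approaches d, where the two charge distributions begin to touch.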
3.3
Induction, charge-transfer and SCF deformation
At first glance the induction energy is a very "classical" term, and conceptually a simple one. A multipole moment on A induces another multipole moment on B, and eventually they interact electrostatically. Different mechanisms behind the induction interaction in a variety of molecular settings were elucidated and composed into an elegant theory [7]. It is this simple, however, only so long as the condition of applicability of the multipole approximation, $R > r_a + r_b$, holds (cf. our discussion of the dispersion interaction). This condition not only keeps the electrons of A away from the electrons of B (thereby obeying the Pauli exclusion principle), but also screens them from the nuclei of the partner molecule (thus avoiding the singularity due to coinciding positions of an electron and a nucleus). At close range, however, the electrons of A mingle with those of B, and since the Pauli principle between monomers is not imposed, they may deeply penetrate each other's occupied space. There is nothing to prevent an energetically beneficial charge flow from one monomer into a manifold of symmetry-forbidden states of another monomer. Not restrained by the Pauli principle, such a flow may result in multiply occupied spinorbitals of this monomer, a process dubbed "unphysical charge transfer" [22,39].

A good example of the unphysical charge transfer was provided for the He-Li+ interaction [39]. The induction interaction leads then to a transfer of the electrons from Li+ to the occupied space of He. In effect, the induction interaction in the minimum region, although a well defined quantity, describes some huge, unphysical interaction. A proper remedy is to enforce the Pauli exclusion principle during the course of the induction interaction, as in the routine supermolecular calculations for the dimer. The resulting induction effect is then accompanied and restrained by the exchange effects, and is termed the SCF-deformation energy.
However, dissecting the SCF-deformation energy into the induction and exchange-induction contributions would be largely dubious because, if the induction alone is not a physical quantity, neither would be the accompanying exchange effect. In addition, both are much larger than the SCF-deformation effect. From the preceding discussion it is clear that the induction energy may function as an approximation to the SCF-deformation, but not on its own. Therefore, one should carefully monitor the legitimacy of this approximation in every case and for every geometry of a complex. Although a rigorous extraction of the classical induction effect cannot be performed, the induction multipole expansion may still be useful as a template for modeling the SCF-deformation effect and its calibration.

Another consequence is the impossibility of a rigorous separation of the component termed "charge-transfer" energy. This type of energy appeared in early partitioning treatments of the interaction energy [40]. The charge-transfer effect referred to that part of the induction effect which required a description of the charge cloud modification at A not only with the orbitals of A (that was referred to as the "true" induction), but also with the orbitals of B. In the language of the Valence Bond theory the latter would correspond to the ionic structures, e.g. $A^{+}B^{-}$. In fact, even for so-called extended basis set calculations the induction component of the total electronic wave function is not efficiently reproduced unless "exchange" and "ionic" terms are allowed for. However, an obvious problem with this definition is that in the real dimer the borders of monomers are not defined, and it is impossible to distinguish which charge cloud deformation is already a charge transfer and which is just a simple induction effect. One can do it for finite basis set treatments, but such a definition is strongly basis set dependent, and the "ionic" and "exchange" components of the wavefunction vanish in the limit of a complete basis set. These ambiguities led some researchers to propose that monomers in the dimer be defined through orthogonal orbitals. Then, however, the charge-transfer contribution grew very large and became a dominant contribution, absorbing portions of the electrostatic and exchange components [41]. In view of these problems the question is whether we really need a separate charge-transfer term.
In a rigorous quantum mechanical treatment the induction, charge-transfer, and exchange effects are blended together in the SCF-deformation term and in the correlation corrections to it. Modeling is usually based on the multipole expansion of the induction energy, restrained by exchange and penetration effects. This is a legitimate approach since asymptotically the deformation effect is correctly described by the induction multipole expansion. Some researchers, however, attempt to include a separate charge-transfer (CT) term, and argue that this results in a better numerical fit [42]. Indeed, ab initio calculations are often slowly convergent if ionic and exchange components are left out, and a similar problem may pertain to modeling. Despite all the aforementioned problems, the role of a separate CT term in the modeling of a potential should be further investigated.
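The asymptotic induction model mentioned above can be sketched for the simplest case of an ion polarizing a spherical atom, for which the leading term of the induction multipole expansion is $-\alpha q^2/(2R^4)$ in atomic units. The code below is an illustrative sketch only. The Tang-Toennies damping function is a technique swapped in here as one common way of restraining the expansion at short range, not a prescription from this chapter; the damping parameter `b` is hypothetical, and the He polarizability of about 1.38 atomic units is an approximate literature value.

```python
import math


def induction_energy(charge, alpha, r):
    """Leading induction term for an ion of given charge polarizing a spherical atom
    with dipole polarizability alpha: E = -alpha * q^2 / (2 r^4) (atomic units)."""
    return -0.5 * alpha * charge**2 / r**4


def tang_toennies(n, b, r):
    """Tang-Toennies damping function f_n(b r) = 1 - exp(-b r) * sum_{k=0}^{n} (b r)^k / k!."""
    x = b * r
    return 1.0 - math.exp(-x) * sum(x**k / math.factorial(k) for k in range(n + 1))


def damped_induction_energy(charge, alpha, r, b):
    """Induction term switched off smoothly at short range (assumed damping model)."""
    return tang_toennies(4, b, r) * induction_energy(charge, alpha, r)


if __name__ == "__main__":
    alpha_he = 1.38   # approximate He dipole polarizability, a0**3
    b = 2.0           # hypothetical damping parameter, 1/a0
    for r in (2.0, 4.0, 8.0):
        bare = induction_energy(1.0, alpha_he, r)
        damped = damped_induction_energy(1.0, alpha_he, r, b)
        print(f"R = {r:4.1f} a0  bare = {bare:.3e}  damped = {damped:.3e}  (E_h)")
```

The damping factor tends to one at large R, so the asymptotic behavior of the induction expansion is preserved, while at short range it suppresses the unphysically large bare term.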
3.4 Example 1. Ar-CO2: dispersion-bound complex

One reason for studying the RG-molecule complexes is that an RG atom serves as a structureless probe of the interaction proclivity of a molecule. In addition, such complexes are simpler than molecule-molecule interactions since they lack a strong long-range electrostatic factor, which may obscure other energy components. Not surprisingly, the RG-molecule systems provide convenient examples for studying weak intermolecular interactions for both theory and experiment. To describe the composition of the PES from well-defined and physically sensible fundamental parts, we briefly describe our recent results for the Ar-CO2 complex [43]. The coordinates for this complex are shown in Fig. 4.A. The geometry of CO2 was kept rigid. The interaction energy perturbation components are also shown in Fig. 4, for R = 3.7 Å.

The dominant short-range part, the HL-exchange energy, $\varepsilon_{exch}^{HL}$, has a strong angular dependence with a minimum at 90° and maxima at 0° and 180°. A wide niche in the HL-exchange plot at 60° < Θ < 120° is observed. As pointed out in Sec. 3.1 [19], such regions of reduced repulsion are directly related to the electron density depletions in the outer, diffuse region of a molecule. The plot of the Laplacian of the electron density [44] (cf. Fig. 5.A) indicates a depletion around the central C atom. At the same time, an Ar atom moving along the O=C=O molecule experiences a distinct reduction of exchange repulsion right in the middle, Fig. 5.B. Towards the O atoms, beginning at 40°, the HL-exchange rapidly rises. Of particular interest is that no additional depletions or concentrations of electron density appear at the outskirts of the terminal O atoms, i.e., there is no evidence of the lone pairs at the O atoms. This is, again, in accord with the plots of the Laplacian of the electron density in Fig. 5.A.

The anisotropies of the electrostatic ($\varepsilon_{es}^{(10)}$) and induction ($\varepsilon_{ind,r}^{(20)}$) perturbation terms and of the SCF-deformation ($\Delta E_{def}^{SCF}$) are quite similar and qualitatively reciprocal to the anisotropy of the HL-exchange energy, see Fig. 4.A. $\varepsilon_{es}^{(10)}$ is reduced for this complex to the penetration part only and has no long-range component. $\varepsilon_{ind,r}^{(20)}$, which includes the interaction of the quadrupole and higher CO2 moments with the induced moments of Ar (restrained by overlap effects and truncated at the second order), represents a poor quantitative approximation to $\Delta E_{def}^{SCF}$. This is not unexpected since $\Delta E_{def}^{SCF}$ also encompasses exchange effects that prevent the Pauli principle violation in the course of interaction. The correlation components are shown in Fig. 4.B. $\Delta E^{(2)}$ is dominated by the dispersion term $\varepsilon_{disp}^{(20)}$. The electrostatic-correlation component, $\varepsilon_{es,r}^{(12)}$, is practically negligible. The accompanying exchange corrections included in $\Delta E_{exch}^{(2)}$ are small but not negligible in the vicinity of the O atoms.

Figure 4. The system of coordinates and angular dependence of the interaction energy components in ArCO2 at R = 7.0 a0; A) at the SCF level of theory, B) at the correlated level of theory. (Both panels plot E in μE_h against Θ in degrees.)

Figure 5. A) Relief map of the negative Laplacian of the charge density, $-\nabla^2\rho$, of the carbon dioxide molecule. B) Repulsive effect of the HL-exchange term, $\varepsilon_{exch}^{HL}$, as a function of Δ, the coordinate of the Ar atom moving along the molecular axis at R = 3.75 Å in a parallel fashion. (Panel B plots E in μE_h against Δ in Å.)
3.5 Example 2. Water dimer: introducing electrostatics

The water dimer is the most important H-bonded system. The major attractive contribution to the interaction energy of the water dimer is the electrostatic effect. It dominates over the other attractive terms, such as the induction and dispersion energies, and it is the most anisotropic. To discuss the properties of the fundamental components in the water dimer case we chose to demonstrate the angular dependence of various terms in the dimer geometry derived from the cyclic configuration of a trimer (see Fig. 6). The components of the SCF interaction energy, $\varepsilon_{exch}^{HL}$, $\varepsilon_{es}^{(10)}$ and $\Delta E_{def}^{SCF}$, are shown in Fig. 6. It is clear that the anisotropy is determined by the electrostatic contribution. This term generates both a minimum for the H-bond geometry and the barriers for the H-to-H and O-to-O configurations. The exchange energy behaves differently. It is greatly enhanced at the H-to-H structure, from which it gently rolls down as α increases to reach a broad minimum in the region of the O-to-O form. The effect of the exchange term on the total anisotropy is consequently small, although quantitatively the attraction of the electrostatic energy is considerably reduced. The correlation contributions are of secondary importance. They primarily include the weakly anisotropic dispersion term (cf. Fig. 6) and a smaller but much more anisotropic electrostatic-correlation term. In the PES modeling, the latter may be safely absorbed in the total electrostatic interaction.

In the above description we need no reference to lone electron pairs. Nevertheless, one may see local concentrations of the electron density in this region in the Laplacian of charge density plots, cf. Fig. 7.A [33]. Similarly, one can notice local maxima of the electrostatic field along the directions of the lone pairs, Fig. 7.B [13]. On the other hand, no distinct local maxima are seen on the exchange-repulsion contour, cf. Fig. 7.C [31]. The reason is that these features are fairly short ranged, and thus may affect the directionality of interaction only at shorter intermolecular separations. For example, in the water dimer the O-O distance is relatively small, and the familiar tilted structure of the dimer is rationalized by the OH group of one monomer pointing towards the lone pair of the other. On the other hand, when the moieties are farther apart, as in the Na+...water complex, the cation attaches itself along the C2 axis of water and not along one of the lone pairs. A rare gas atom also seems to take little notice of a lone pair, cf. Fig. 7.C.

Figure 6. Angular dependence of various two-body interaction energy components in the cyclic planar H2O trimer at R = 3.0 Å. (The plot shows E in mE_h against α in degrees, with the O-to-O, H-bond, and H-to-H configurations marked.)

Figure 7. A) Relief map of the negative Laplacian of the charge density, $-\nabla^2\rho$, in the plane perpendicular to the molecular plane [33]. $-\nabla^2\rho$ exhibits two local maxima and two saddle points. The maxima may be identified with localized lone electron pairs. B) Electrostatic equipotential contours for the water molecule, in the plane perpendicular to the molecular plane [13]. The contour spacing is 2000 cm^-1 (for a unit test charge), and the distance units for the horizontal and vertical axes are atomic units (au). C) The exchange repulsion contour of H2O derived from the He-H2O complex at R(He-O) = 3.5 Å, defined by two polar coordinates (energy, angle), and drawn in the same plane as the Laplacian. The contour is the image of the water molecule shape, as detected by a rare-gas atom [31]. The regions of lone electron pairs are indicated with arrows, but no apparent sign of their presence is observed. The lone-pair electron concentrations are not diffuse enough to show up in the van der Waals minimum region.

3.6 General considerations

The two preceding examples of the ArCO2 complex and the water dimer illustrate well some general trends in shaping the PES following the behavior of its fundamental components. The short-range shape is always molded by the exchange-repulsion, primarily described by $\varepsilon_{exch}^{HL}$. The long-range part of the PES is expected to follow the shape of the dispersion term if $\varepsilon_{disp}^{(20)}$ is the primary binding factor. If the complex binding is dominated by the electrostatic or induction interactions, then they assume the role of determining the long-range behavior. Interestingly, the anisotropies of all these fundamental components are only weakly dependent on the intermolecular separation - unlike the anisotropy of the total PES. The shapes of the long- and short-range parts of the PES are thus easily predictable. This is often not the case in the Van der Waals minimum region, where we encounter a delicate balance of all components.
The geometry of the global minimum of a Van der Waals complex may sometimes be difficult to postulate a priori unless quite accurate calculations are performed. In a majority of cases, however, the picture is simple. The dispersion term displays a relatively weak orientation dependence, whereas the exchange and electrostatic terms (and induction to some extent) are strongly anisotropic. Consequently, in the absence of long-range electrostatic interactions the equilibrium geometry is determined by the exchange anisotropy. This is the case of the previously discussed ArCO2. The situation is different, however, for interactions of ionic and polar species. If the electrostatic interactions dominate, their anisotropy determines the shape of the PES in the minimum region. The latter fact was successfully exploited by the electrostatic model of Buckingham and Fowler [12] and other, similar models [13]. Between these two extremes one may encounter more complex combinations. For instance, the minima on the PES of rare gases with polar molecules often follow the anisotropy of the induction component. The interaction between an RG atom and a polar molecule such as HF can serve as an example [19].
4. MODELING OF PES AND ITS COMPONENTS

There may be many different functional representations of PESs based on a variety of mathematical techniques (e.g. polynomials or splines) and different physical models (e.g. the atoms-in-molecule approach). Theoretically, all of them should work adequately in simulations of scattering, spectroscopic or thermodynamic properties. However, we believe that their usefulness can be greatly enhanced when a high numerical accuracy of the interaction energy is combined with a sensible representation of the fundamental components. Only then can we combine a full understanding of the nature of the interaction at the molecular level with the predictive power of the spectroscopic, scattering, and thermodynamic theories. The preparation of such potentials will be demonstrated using again ArCO2 and (H2O)2 as examples.

4.1 Ar-CO2

The total potential energy was expressed as a sum of the fundamental components: the short-range repulsive term ($V_{sr}$), the dispersion energy component ($V_{disp}$) and the induction part ($V_{ind}$) [43]:

$$V = V_{sr} + V_{disp} + V_{ind} \tag{3}$$
The fourth term, the electrostatic interaction, was omitted since one monomer, the Ar atom, has no permanent multipole moments. The $V_{sr}$ part accounts for the monomers' shape factor in the potential. It mainly includes the exchange effects. However, other effects, such as the overlap-dependent electrostatic interaction, may also be conveniently absorbed in $V_{sr}$. Following Buckingham, Fowler and Hutson [14], $V_{sr}$ may be adequately modeled by a Born-Mayer exponential form:

$$V_{sr}(R,\Theta) = A \exp\{-\beta(\Theta)\,[R - R_{ref}(\Theta)]\} \tag{4}$$

where $R_{ref}$ and $\beta$ are expanded in the Legendre polynomials $P_L(\cos\Theta)$:

$$R_{ref}(\Theta) = \sum_L R_{ref,L}\, P_L(\cos\Theta) \tag{5}$$

$$\beta(\Theta) = \sum_L \beta_L\, P_L(\cos\Theta) \tag{6}$$
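The anisotropic Born-Mayer form of Eqs. (4)-(6) can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names and all numerical parameters below are hypothetical placeholders, not the fitted Ar-CO2 values of Ref. [43].

```python
import numpy as np
from numpy.polynomial import legendre


def legendre_series(coeffs, theta):
    """Evaluate sum_L c_L * P_L(cos(theta)) for coefficients c_0, c_1, c_2, ..."""
    return legendre.legval(np.cos(theta), coeffs)


def v_short_range(r, theta, a, beta_coeffs, rref_coeffs):
    """Born-Mayer repulsion, Eq. (4), with beta and R_ref expanded as in Eqs. (5)-(6)."""
    beta = legendre_series(beta_coeffs, theta)
    r_ref = legendre_series(rref_coeffs, theta)
    return a * np.exp(-beta * (r - r_ref))


# Hypothetical parameters (atomic units), chosen only to mimic a rod-shaped
# molecule: R_ref is larger along the molecular axis than perpendicular to it.
A_PREF = 1.0e-3                 # prefactor A
BETA_L = [1.9, 0.0, 0.1]        # beta_0, beta_1, beta_2
RREF_L = [6.5, 0.0, 1.0]        # R_ref,0, R_ref,1, R_ref,2
```

With these placeholder values the repulsion at a fixed R is much stronger for the collinear approach (Θ = 0) than for the T-shaped one (Θ = 90°), which is the qualitative behavior of the HL-exchange term described in Sec. 3.4. Only two angular coefficients are needed because the anisotropy enters through the argument of the exponential.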
This form of $V_{sr}$ is particularly efficient for rod-shaped, elongated molecules. For such molecules a direct expansion of $V_{sr}$ in terms of $P_L(\cos\Theta)$ would have a prohibitively slow convergence because of the strong radial-angular coupling of the PES. However, $R_{ref}$ as a function of Θ effectively takes care of this problem through a relatively short expansion within the argument of the exponential function. The dispersion part of the PES was represented using the following damped expansion:

$$V_{disp} = -\sum_{n=6}^{10}\ \sum_{L=0,\ \mathrm{even}}^{n-4} D_n(\beta,R)\, C_{n,L}^{disp}\, \frac{P_L(\cos\Theta)}{R^n} \tag{7}$$

where $C_{n,L}^{disp}$ are the Van der Waals dispersion coefficients, which are defined by the multipole expansion of the dispersion energy through n = 10. Since the monomer electron distributions overlap, the resulting quantum mechanical effect is accounted for by the damping functions $D_n$ designed by Tang and Toennies [46]:

$$D_n(\beta,R) = 1 - \exp[-\beta(\Theta) R] \sum_{k=0}^{n} \frac{[\beta(\Theta) R]^k}{k!} \tag{8}$$
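The damped dispersion expansion of Eqs. (7) and (8) can be sketched as follows. This is a minimal illustration, not the code used to build the potential of Ref. [43]: the function names are hypothetical, the caller is assumed to supply β already evaluated at the angle Θ, and the coefficients are passed as a dictionary keyed by (n, L).

```python
import math
from numpy.polynomial import legendre


def tang_toennies(n, beta, r):
    """Damping function D_n of Eq. (8): 1 - exp(-beta*r) * sum_{k=0}^{n} (beta*r)^k / k!."""
    x = beta * r
    partial_sum = sum(x**k / math.factorial(k) for k in range(n + 1))
    return 1.0 - math.exp(-x) * partial_sum


def v_dispersion(r, theta, beta, coeffs):
    """Damped dispersion energy of Eq. (7).

    coeffs maps (n, L) -> C_nL; beta is the value of beta(theta) at this angle.
    """
    cos_t = math.cos(theta)
    total = 0.0
    for (n, big_l), c_nl in coeffs.items():
        unit = [0.0] * big_l + [1.0]            # selects the single polynomial P_L
        p_l = legendre.legval(cos_t, unit)
        total += tang_toennies(n, beta, r) * c_nl * p_l / r**n
    return -total
```

The damping function vanishes at R = 0 and tends to 1 at large R, so the undamped multipole series is recovered asymptotically while its divergence at short range is suppressed.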
Finally, the induction interaction $V_{ind}$ was cast into a form similar to that of the dispersion part, with $C_{n,L}^{ind}$ describing the interactions between permanent and induced moments of particular ranks:

$$V_{ind} = -\sum_{n=8}\ \sum_{L=0,\ \mathrm{even}}^{n-4} C_{n,L}^{ind}\, \frac{P_L(\cos\Theta)}{R^n} \tag{9}$$
In fact, in complexes like ArCO2, where multipole interactions are fairly weak, the induction interaction is significantly affected by the short-range exchange effects whose role is to prevent a Pauli-principle-violating redistribution of electrons. These effects are related to the molecular shape factor and may be absorbed in $V_{sr}$. The overall procedure described here and the final form of the potential are applicable to any complex of an atom and a linear molecule. They may also be generalized to more complex systems, composed of polyatomic monomers, as exemplified in the next Section by the water dimer.

How efficient is the described representation of the ArCO2 potential? To answer this question the above PES, along with a few empirical potentials, has been used to derive a number of properties, such as the ground vibrational state and dissociation energy of the complex, ground state rotational constants, the mean square torque, the interaction second virial coefficients, diffusion coefficients, mixture viscosities, thermal conductivities, the NMR relaxation cross sections, and many others [47]. Overall, the ab initio surface provided very good simulations of the empirical estimates of the studied properties. The only parameters that were not accurately reproduced were the interaction second virial coefficients. It is important that the performance of the ab initio surface proved comparable to that of the best empirical surface, 3A, of Bohac, Marshall and Miller [48]. This fact must be greeted with satisfaction since no empirical adjustments were performed for the ab initio surface.

4.2 Water dimer

As a second model potential we shall briefly discuss the PES for the water dimer. Analytical potentials developed from ab initio calculations have been available since the mid-seventies, when Clementi and collaborators proposed their MCY potential [49]. More recent calculations by Clementi's group led to the development of the NCC surface, which also included many-body induction effects (see below) [50]. Both potentials were fitted to the total energy and therefore their individual energy components are not faithfully represented. For the purposes of the present discussion we will focus on another ab initio potential, which was designed by Millot and Stone [51] primarily with the interaction energy components in mind. This PES was obtained by applying the same philosophy as in the case of ArCO2, i.e., both the template and the calibration originate from quantum chemical calculations, and are rooted in the perturbation theory of intermolecular forces.

Compared to the ArCO2 complex there is an important new factor - the electrostatic energy. A water molecule possesses sizeable dipole and quadrupole moments, and thus the electrostatic energy provides the dominant attractive contribution. Since the electrostatic interaction is strongly anisotropic, the orientational dependence of the interaction energy is governed by this term. The nonspherical symmetry of both monomers introduces an additional difficulty. The functional forms of the individual interactions should account for the mutual orientations of the two water molecules. Customarily, two approaches are used to incorporate this angular dependence. The first distinguishes a single interaction center at each molecule; the centers are connected by the intermolecular vector R = [R, ω], where ω specifies its orientation. The orientations of the molecules in the space-fixed coordinate system are described by the two sets of Euler angles {ω_a} and {ω_b}. If the multipole expansion within such a single-centered approach is to be convergent, the already mentioned convergence criterion, ra+rb
Electrostatic energy

In the water-dimer potential of Ref. [51], each water molecule has three interaction sites centered on the atoms. The long-range electrostatic energy assumes the functional form

$$V_{es} = \sum_{a \in A} \sum_{b \in B} Q_t^a\, T_{tu}^{ab}\, Q_u^b \tag{10}$$

It includes the interactions of distributed multipole moments Q (up to a quadrupole) labeled t and u. The T matrix provides the Coulomb energy appropriate for particular multipoles and includes the distance between sites a and b and their relative orientations. The short-range (penetration) component of the electrostatic energy, in a manner similar to the Ar-CO2 case, can be absorbed into the exchange repulsion term.
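To make the structure of Eq. (10) concrete, the sketch below evaluates only its lowest-rank contribution, in which both multipoles are site charges and the interaction function reduces to 1/R_ab (atomic units). This is a deliberate simplification: the potential of Ref. [51] also carries dipoles and quadrupoles on each site, and the function name and array layout here are hypothetical.

```python
import numpy as np


def electrostatic_charges_only(sites_a, charges_a, sites_b, charges_b):
    """Charge-charge part of Eq. (10): sum over a in A, b in B of q_a * q_b / R_ab.

    Positions are (n_sites, 3) arrays in bohr, charges are in units of e,
    and the returned energy is in hartree.
    """
    sites_a = np.asarray(sites_a, dtype=float)
    sites_b = np.asarray(sites_b, dtype=float)
    charges_a = np.asarray(charges_a, dtype=float)
    charges_b = np.asarray(charges_b, dtype=float)
    # Matrix of all site-site distances R_ab between the two molecules.
    diff = sites_a[:, None, :] - sites_b[None, :, :]
    dist = np.linalg.norm(diff, axis=-1)
    return float(np.sum(np.outer(charges_a, charges_b) / dist))
```

Higher-rank terms have the same double-sum structure; only the interaction functions T become orientation dependent.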
Short-range energy

The exchange-repulsion energy has been fitted to the following functional form [51,52]:

$$V_{sh} = \sum_{a \in A} \sum_{b \in B} \exp\{-\alpha_{ab}\,[R_{ab} - \rho_{ab}(\Omega)]\} \tag{11}$$

with $R_{ab}$ the distance between sites a and b, $\alpha_{ab}$ a hardness parameter depending on the pair of sites, and $\rho_{ab}(\Omega)$ an orientation-dependent parameter describing the effective size of the atoms. The orientation dependence of $\rho_{ab}(\Omega)$ is given by a simple model that assigns to each site a shape described in terms of an orientation-dependent radius ρ:

$$\rho_a(\theta,\phi) = \sum_{l,k} \rho_{lk}^a\, C_{lk}(\theta,\phi), \qquad \rho_{ab}(\Omega) = \rho_a(\Omega) + \rho_b(\Omega) \tag{12}$$

where $C_{lk}(\theta,\phi)$ are renormalized spherical harmonics and $\rho_{lk}^a$ the related expansion coefficients.

Dispersion energy

A reasonable multi-site representation for the dispersion energy was developed by Szczęśniak et al. [53]
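$$E_{disp} = \sum_{a \in A} \sum_{b \in B} f_{ab}(R_{ab})\, C_6^{ab}\, R_{ab}^{-6}, \qquad f_{ab}(R) = \begin{cases} 1 & \text{for } R > r_a + r_b \\ \exp\left[-\left(\dfrac{r_a + r_b}{R} - 1\right)^2\right] & \text{for } R \le r_a + r_b \end{cases} \tag{13}$$

where C are Van der Waals site-site coefficients, f is a damping factor, and $r_a$ and $r_b$ are the Van der Waals radii. The C coefficients were obtained by fitting to the second-order nonexpanded dispersion term. Since dispersion has a weaker anisotropy than the other terms, it is also quite sufficient to use a single-centered expansion

$$V_{disp} = -\sum_{n=6}^{10}\ \sum_{l_1,l_2,j,k_1,k_2} \frac{C_n(l_1,l_2,j;k_1,k_2)}{R^n}\, \bar{S}_{l_1 l_2 j}^{k_1 k_2}(\omega_a,\omega_b,\omega)\, f_n(R) \tag{14}$$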
where R is the distance between the centers of mass, $C_n(l_1,l_2,j;k_1,k_2)$ are the anisotropic dispersion coefficients, $\bar{S}_{l_1 l_2 j}^{k_1 k_2}(\omega_a,\omega_b,\omega)$ are the orientation functions first proposed by Stone [54], which include products of Wigner rotation matrices and 3j coefficients, and $f_n(R)$ are the damping functions of Tang and Toennies [46] (see Eq. (8)). The coefficients $C_n(l_1,l_2,j;k_1,k_2)$ have been derived by Rijks and Wormer [55]. The multi-center form of the dispersion term is more in the spirit of the other terms, and despite its simplicity proved to perform better in some applications, e.g. in calculations of tunneling splittings between nonsuperimposable forms of the water dimer [56,57].

Induction energy

To describe the induction interaction the single-center approach with one-site polarizabilities was found to be the most efficient. $V_{ind}$ has been expressed as

$$V_{ind} = \frac{1}{2} \sum_{A} \sum_{B \neq A} \Delta Q_t^a\, T_{tu}^{ab}\, f_n(R_{ab})^{1/2}\, Q_u^b \tag{15}$$

and

$$\Delta Q_t^a = -\sum_{B \neq A} \alpha_{tt'}^{a}\, T_{t'u}^{ab}\, f_n(R_{ab})^{1/2}\, (Q_u^b + \Delta Q_u^b) \tag{16}$$

where T is the interaction function discussed in the context of Eq. (10), Q are multipole moments, and $f_n(R)$ are the damping functions. A distributed multipole moment Q at b induces a multipole moment ΔQ at a, the response being governed by the polarizability tensor $\alpha_{tt'}^{a}$. Due to its coupled form (i.e. each induced moment depends on the induced moments on all the other centers), Eq. (16) should be solved iteratively. The missing element in this model is the accompanying exchange effect. As mentioned in the preceding section, this effect may be partly accommodated in the short-range repulsive component. Nevertheless, the modeling of the SCF deformation energy is still not quite satisfactory.

The potential for the water dimer that was just described is not yet very accurate. Since, however, both the functional form and the calibration were derived from ab initio calculations, there is room for well-controlled improvements that would follow future, more accurate ab initio data. This model provided the geometry of the global minimum in very good agreement with experiment, and a fair account of the second virial coefficient. It should be mentioned that the well depth of this potential is smaller than the experimental value of 5.4 ± 0.7 kcal/mol. However, the best ab initio calculations on the water dimer also consistently predict smaller values. This potential was also used in a number of bound-state calculations on the water dimer and trimer using either a variational basis set approach [56] or the rigid-body diffusion Monte Carlo (RBDMC) method [57]. The predicted tunneling splittings in the dimer were in very good agreement with the experimental findings. Upon augmenting it by the three-body induction contribution this potential was also used by Gregory and Clary in the RBDMC simulations of the tunneling dynamics in the water trimer [58], tetramer, and pentamer [59].
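The iterative solution required by the coupled form of Eq. (16) can be illustrated with a deliberately reduced model. The sketch below assumes one polarizable center per molecule with an isotropic dipole polarizability, keeps only induced dipoles, and sets the damping functions to 1; the permanent multipoles enter only through the electric field they produce at each center. All function names are hypothetical, and this is not the implementation of Ref. [51].

```python
import numpy as np


def field_of_charges(point, positions, charges):
    """Electric field (atomic units) at `point` due to a set of point charges."""
    d = np.asarray(point, dtype=float) - np.asarray(positions, dtype=float)
    r = np.linalg.norm(d, axis=1)
    q = np.asarray(charges, dtype=float)
    return np.sum(q[:, None] * d / r[:, None] ** 3, axis=0)


def field_of_dipole(point, position, dipole):
    """Electric field at `point` due to a point dipole located at `position`."""
    d = point - position
    r = np.linalg.norm(d)
    n = d / r
    return (3.0 * np.dot(dipole, n) * n - dipole) / r**3


def solve_induced_dipoles(centers, alphas, permanent_fields, tol=1e-12, max_iter=200):
    """Iterate mu_a = alpha_a * (E0_a + sum_{b != a} field of mu_b) to self-consistency."""
    centers = np.asarray(centers, dtype=float)
    alphas = np.asarray(alphas, dtype=float)
    e0 = np.asarray(permanent_fields, dtype=float)
    mu = alphas[:, None] * e0                     # first-order (uncoupled) guess
    for _ in range(max_iter):
        new_mu = np.empty_like(mu)
        for a in range(len(centers)):
            field = e0[a].copy()
            for b in range(len(centers)):
                if b != a:
                    field += field_of_dipole(centers[a], centers[b], mu[b])
            new_mu[a] = alphas[a] * field
        if np.max(np.abs(new_mu - mu)) < tol:
            return new_mu
        mu = new_mu
    raise RuntimeError("induced dipoles did not converge")


def induction_energy(mu, permanent_fields):
    """Induction energy -1/2 * sum_a mu_a . E0_a, the dipole-only analogue of Eq. (15)."""
    return -0.5 * float(np.sum(mu * np.asarray(permanent_fields, dtype=float)))
```

In a dimer the permanent field at each center would be computed from the multipoles of the partner molecule, for instance with `field_of_charges`. For clusters the same loop automatically produces the many-body induction, because every induced dipole responds to the induced dipoles on all other centers.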
5. TRIMERS AND NONADDITIVE EFFECTS

So far, our discussion of intermolecular potentials has been limited to pair interactions. In clusters involving more than two monomers the three- and higher-body terms will appear. For example, the total energy of a trimer ABC may be expressed as follows:

$$E_{ABC}^{(i)} = \sum_{X=A,B,C} E_X^{(i)} + \sum_{X=A,B,C} \Delta E_X^{(i)} + \sum_{X>Y=A,B,C} \Delta E_{XY}^{(i)} + \Delta E_{ABC}^{(i)} \tag{17}$$

where (i) denotes a particular level of theory, e.g. Hartree-Fock (HF) theory, an order of Møller-Plesset perturbation theory, or any other size-consistent treatment of correlation effects, such as Coupled Cluster (CC) theory. The second, third and fourth terms describe, respectively, the one-, two-, and three-body contributions. The one-body term describes the effects of the geometry relaxation of the subsystem X in the trimer. A two-body term $\Delta E_{XY}^{(i)}$ describes the pairwise interaction between two monomers, and the term $\Delta E_{ABC}^{(i)}$ represents the three-body contribution arising between the relaxed-geometry monomers arranged in the same way as they occur in the complex.

Many-body effects are known to affect, directly and indirectly, various bulk properties of liquids and solids. Binding energies of rare-gas crystals indicate 10% deviations from additivity [60], and many-body effects are believed to be responsible for lattice structures different from those expected from pairwise additivity [61,62]. Many-body effects are particularly important when hydrogen-bond, ion-polar, or ion-ion interactions are present. For example, the simultaneous description of structural and thermodynamic properties of the three phases of water was found impossible without the explicit consideration of many-body contributions [63]. In ion-water interactions, the nonadditive effects were found to alter the coordination numbers of solvated ions [64] and the structures of solvation shells [65]. They also proved to be a crucial binding factor in the beryllium clusters [66] and in the quartet state of alkali metal trimers [67]. Finally, in some alkaline earth-halogen crystals the induction nonadditivity has been linked to unexpected, layered crystal structures [68]. Although investigations of bulk properties offered a great deal of evidence of the importance of many-body effects, it is the study of clusters that in recent years has led to a better understanding of these effects.

Many-body interactions are much more difficult to retrieve from experimental data or to compute directly by ab initio methods than the pair interactions. Discerning different three-body terms may be accomplished within the framework of SAPT [69,70], and the relationship to supermolecular interaction energy terms has also been obtained [19,71] (cf. Table 1). Similar to the binary interactions, the nature of nonadditivity may be explained in terms of the fundamental nonadditive components: induction, dispersion, and exchange; the fourth one, the electrostatic energy, is always additive.
While the absence of the electrostatic part reduces the number of elementary terms, the fact that in the trimer there are three different pair interactions leads to a greater variety of the mutual couplings of fundamental components than in the case of binary interactions. For example, in addition to a pure three-body induction term, there are also mixed terms, such as induction-dispersion, exchange-induction, or exchange-dispersion, which correspond to situations when one pair is involved in one type of fundamental interaction (e.g. induction) and another pair in a different type (e.g. dispersion) [19,71,72]. With the advances in cluster spectroscopy the field of Ar2-chromophore clusters has become a convenient laboratory for experimental and theoretical investigations of three-body interactions [73]. Due to the presence of a chromophore these complexes can be studied by IR or microwave spectroscopic techniques. Theoretical techniques are used to predict these spectra by assuming some form of the three-body interaction potential. This approach requires accurate two-body potentials to retrieve the three-body effects from the experimental data. Unfortunately, the need for accurate pair potentials greatly restricts the number of useful chromophores. At the time of this writing only a few pair potentials qualify for this label, the most important being those for Ar-Ar, Ar-HF and Ar-HCl. For this reason, the related Ar2HX clusters have been studied in great detail. Owing to the synergistic interaction between theory and experiment, the nature of nonadditivity is close to being well understood in these clusters.
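The bookkeeping behind the many-body expansion of Eq. (17) can be sketched as a short routine that takes the energies of the monomers, dimers and trimer, all computed at the same level of theory in the geometry they have in the cluster, and returns the one-, two- and three-body terms. The function name and the dictionary layout are hypothetical, and the numbers in any example are made up purely for illustration.

```python
def trimer_decomposition(e, e_free=None):
    """Split a trimer energy into one-, two- and three-body terms as in Eq. (17).

    e      : dict with keys "A", "B", "C", "AB", "AC", "BC", "ABC" holding
             subsystem energies at the geometry adopted in the trimer.
    e_free : optional dict with the energies of the free (relaxed) monomers;
             if omitted, the monomers are treated as rigid and the one-body
             terms vanish.
    """
    monomers = ("A", "B", "C")
    pairs = ("AB", "AC", "BC")
    if e_free is None:
        e_free = {x: e[x] for x in monomers}
    # One-body terms: cost of deforming each monomer to its in-cluster geometry.
    one_body = {x: e[x] - e_free[x] for x in monomers}
    # Two-body terms: pair interaction energies.
    two_body = {p: e[p] - e[p[0]] - e[p[1]] for p in pairs}
    # Three-body term: what remains after removing all pair interactions.
    three_body = (e["ABC"]
                  - sum(e[p] for p in pairs)
                  + sum(e[x] for x in monomers))
    return one_body, two_body, three_body
```

Summing the free-monomer energies and all returned terms reproduces the trimer energy exactly, which is a convenient consistency check on any set of input energies.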
5.1 Ar2-chromophore clusters: exchange and dispersion nonadditivity

The three-body exchange nonadditivity, $\varepsilon_{exch}^{HL}$, is conveniently categorized as related to two general mechanisms depicted in Fig. 8: the first, where the permutation of electrons encompasses all three monomers, and the second, where only two are involved. They are referred to as triple-exchange (TE) and single-exchange (SE) nonadditivities [72].

Figure 8. Diagrams representing the Heitler-London exchange terms, single exchanges (SE) and triple exchanges (TE). The electron exchange operators are symbolized with arrows and the interaction operator with a dashed line.

Figure 9. Definition of geometrical parameters in Ar2HX (X = F, Cl). R is the distance between the center of mass of HX and the middle of Ar2. Θ is the angle between the R vector and the HX axis. Φ is a dihedral angle between the Ar-Ar and HX axes.
SE nonadditivity

The exchange repulsion within dimer AB may be viewed as a distortion of the charge densities of A and B; the electrons are pushed outward and effectively create an "exchange quadrupole." (This electrical effect will be the most pronounced if A and B are rare-gas atoms.) Next, the "exchange quadrupole" interacts with the permanent moment of C, giving rise to the nonadditive effect [74,75]. In modeling, the SE nonadditivity is characterized by a different functional dependence with respect to different pair separation distances [72,75]. It has a long-range dependence with respect to multipole interactions, but a short-range exponential decay for exchange multipoles.
TE nonadditivity

To better appreciate the behavior of this term let us consider two extreme configurations, an equilateral triangle cluster and a collinear arrangement. The exchange repulsion is often interpreted as a distortion brought about by the theoretical process of orthogonalizing the occupied orbitals of the interacting monomers. If two monomers are already orthogonalized, bringing in a third monomer in a perpendicular approach requires a somewhat weaker orthogonalization. The effective exchange repulsion in the trimer will thus be less than three times the exchange repulsion of a dimer. The nonadditive effect is attractive and may be modeled as

$$\varepsilon_{exch} = -A \exp(-\alpha r_{12}) \exp(-\beta r_{13}) \exp(-\gamma r_{23}) \tag{18}$$
An opposite situation arises for a collinear form. Orthogonalizing two monomers "gets in the way" of orthogonalizing the third monomer. Instead of cooperating, the monomers now compete. Eventually, the net repulsion is larger than the sum of all three pair repulsions. The nonadditive effect is repulsive.
Dispersion nonadditivity

The mechanism of dispersion nonadditivity was proposed over 50 years ago by Axilrod and Teller [76] and independently by Muto [77]. It is referred to as the correlation of three instantaneous dipoles. To better appreciate the behavior of this term, let us consider the same two extreme configurations of a trimer as for the TE nonadditivity described earlier. In the equilateral triangle the three monomers cooperate in correlating with each other; i.e. when a third monomer gets close, it sees the other two conveniently "pre-correlated." In contrast, for the collinear approach of a third monomer this "pre-correlation" takes place in the wrong direction. Since the pair dispersion interaction is attractive, the nonadditivity is repulsive for the equilateral trimer and attractive in the collinear form. The simplest model for the three-body dispersion energy is provided by the well-known triple-dipole expression

$$V_{DDD} = 3 Z_{DDD}\, \frac{1 + 3\cos\theta_1 \cos\theta_2 \cos\theta_3}{R_{12}^3 R_{23}^3 R_{31}^3} \tag{19}$$

where $R_{ij}$ describe the sides and $\theta_i$ the angles of the triangle formed by the three atoms, while $Z_{DDD}$ describes a triple-dipole dispersion coefficient. A more accurate representation should account for higher terms (like DDQ etc.) as well as for the overlap effects.
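The sign behavior of the triple-dipole term can be checked numerically with a short sketch of Eq. (19). The function name is hypothetical, the interior angles are obtained from the side lengths by the law of cosines, and the coefficient Z_DDD is treated as an arbitrary positive input rather than a value for any particular system.

```python
import math


def triple_dipole_energy(p1, p2, p3, z_ddd):
    """Triple-dipole (Axilrod-Teller-Muto) energy of Eq. (19) for three atoms."""
    r12 = math.dist(p1, p2)
    r23 = math.dist(p2, p3)
    r31 = math.dist(p3, p1)
    # Cosines of the interior angles at atoms 1, 2 and 3 (law of cosines).
    cos1 = (r12**2 + r31**2 - r23**2) / (2.0 * r12 * r31)
    cos2 = (r12**2 + r23**2 - r31**2) / (2.0 * r12 * r23)
    cos3 = (r23**2 + r31**2 - r12**2) / (2.0 * r23 * r31)
    angular = 1.0 + 3.0 * cos1 * cos2 * cos3
    return 3.0 * z_ddd * angular / (r12 * r23 * r31) ** 3


# Two extreme arrangements with unit nearest-neighbor distance.
EQUILATERAL = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.5, math.sqrt(3.0) / 2.0, 0.0)]
COLLINEAR = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (2.0, 0.0, 0.0)]
```

For a positive coefficient the equilateral triangle gives a positive (repulsive) contribution, since all three cosines equal 1/2, while the collinear arrangement gives a negative (attractive) one, since the product of the cosines is -1.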
Ar2HCl

The structure of this complex is shown in Fig. 9. Ab initio calculations of the three-body potential determined the relative importance of each term. They showed [78,79] that the total three-body interaction is very anisotropic with respect to the in-plane and out-of-plane rotations of HCl within the cluster (see Fig. 10). It is instructive to have a closer look at the composition of this three-body term. The dispersion component is only moderately anisotropic, and the induction component is slightly more anisotropic than dispersion. The former may be reliably approximated by the third-order DDD term. The latter is dominated by the interaction of multipoles induced on the Ar atoms by the HCl dipole. The most anisotropic is the exchange nonadditivity. It may be split into two physically different components: the "pure" exchange effect, TE, and the electrostatic interaction of the Ar2 "exchange quadrupole" with the HCl dipole, SE. It is important to notice that the two parts of the total exchange nonadditivity show opposite behavior. As seen in Fig. 10, the SE term is strongly repulsive for the Θ=0° geometry, slightly attractive for the Θ=0° angle, and again becomes slightly attractive for Θ=180°. The TE contribution is attractive for Θ=0° and cancels a large portion of the SE contribution. This different behavior is not surprising, as SE follows the pattern of the dipole-quadrupole electrostatic interaction while TE may be related to the mutual orthogonalization of monomers discussed earlier. Overall, approximating the total exchange nonadditivity by only its SE contribution leads to too anisotropic a model.
Ar2HF Ab initio calculations of the three-body potential of Ar2HF by Cybulski et al. [79] reveal considerable differences between this system and Ar2HC1 [78] (see Table 2). First of all, because of a much closer approach of HF to the center of mass of Ar 2, the HL-exchange nonadditivity is larger and much more anisotropic in Ar2HF. The difference between the maximal and minimal value of the interaction energy at a given distance (amplitude) amounts to about 42 ~tEh for Ar2HF vs. 22 ~tEh for Ar2HC1. Another difference is the induction nonadditivity; it is larger (since the dipole of HF is larger than that of HC1), and also much more anisotropic (the amplitude of 38 ].tEh for Ar2HF vs. 15 ~tEh for Ar2HC1). The dispersion nonadditivity is smaller in magnitude and slightly less anisotropic, mostly due to the smaller polarizability of HF. The decomposition of the HL-exchange effect into its SE and TE components is shown in Fig. 10. It is qualitatively similar to the Ar2HC1 case. A comparison of higher order effects also proves instructive: Ar2HF has a significant contribution from the induction-dispersion coupling, whereas Ar2HC1, from the exchangedispersion effect. In conclusion, the three-body effect in these systems represents a somewhat different blend, with the induction-type components being much more important for Ar2HF. The first analytical three-body potential for Ar2HC1 was proposed by Hutson and collaborators [75] on the basis of semiempirical considerations. It consisted of three terms: exchange, induction, and dispersion nonadditivities. The dispersion nonadditivity was represented by Eq.(19), and the induction nonadditivity as an interaction of multipoles induced
[Figure 10: two panels, A) Ar2HCl and B) Ar2HF, plotting E (μEh) against Θ (deg) from -180 to 180, with curves labeled exch, exchSE, exchTE, def, disp and Total.]
Figure 10. The dependence of the three-body components upon the in-plane rotation of: A) HCl in the Ar2HCl cluster, B) HF in the Ar2HF cluster. The following abbreviations have been used: "exch" - $\varepsilon_{exch}^{HL}$; "def" - $\Delta E_{def}^{SCF}$; "disp" - $\varepsilon_{disp}^{(30)}$; "Total" - $\Delta E^{MP3}$; SE and TE denote single-exchange and triple-exchange, respectively.
by HCl on two Ar atoms. The SE exchange nonadditivity was modeled by the interaction between the exchange-induced quadrupole moment on Ar2 and the permanent moments of the HCl molecule. Later, Ernesti and Hutson [80] introduced a number of modifications of the Ar2HF model by recognizing other mechanisms. For example, the interaction in the Ar dimer subsystem involves the dispersion effect, which induces a quadrupole moment of opposite sign to the exchange-quadrupole one. In addition, the induction effects in the trimer, due to the presence of HF rather than HCl, required a more accurate treatment. The nonadditive potential obtained in Ref. [80] was successfully employed to predict the change in the red shift of the HF stretch by Bačić and collaborators [81].
Ar2CO2 In this cluster the three-body effect was detected via the observation of the asymmetric stretching frequency of CO2 by Sperhac et al. [82]. Recent ab initio calculations confirmed the experimental predictions [83]. The exciting aspect of this cluster is that the nonadditive effect on the stretching frequency may be obtained directly with a very good accuracy. The reason is the well defined structure of the Ar2CO2 cluster, shown in Fig. 11.
Figure 11. The T-shaped configuration of the Ar2CO2 cluster.
As long as the two Ar atoms are held in equivalent equatorial positions, the interaction with each of them should, in the pairwise additive approximation, result in the same incremental shifts of the asymmetric stretch of CO2. In reality, a minute nonadditivity of shifts amounting to 0.042 cm^-1 was observed by Sperhac et al. when the second Ar atom was added. Rak et al. [83] used a one-dimensional model to calculate the ab initio estimate of three basic nonadditive components in the v=0 and v=1 levels of this stretch (see Table 3). The three-body dispersion interaction affects the v=0 and v=1 levels more than the other two terms. The effect of induction nonadditivity is about an order of magnitude smaller. Even smaller is the exchange nonadditivity. Interestingly, its SE and TE constituents are large, but quite accurately cancel one another. These results are sensible. Ar2CO2 in the T-shaped configuration is predominantly dispersion bound and the induction effect should play a secondary role only. The signs of the SE and TE terms are also easy to predict by means of the models outlined at the beginning of this Section: exchange-quadrupole interaction and distortion due to orthogonalizing, respectively. A different picture emerges when we analyse the effects of the three-body terms upon the
Table 2
Comparison of nonadditive terms in Ar2HCl and Ar2HF. All values are in μEh.

Three-body term                  Ar2HCl        Ar2HF
$\varepsilon_{exch}^{HL}$           4.8         24.9
$\varepsilon_{exch,SE}^{HL}$       27.2 a)      44.7 b)
$\varepsilon_{exch,TE}^{HL}$      -23.5 a)     -19.8 b)
$\Delta E_{def}^{SCF}$             11.5         30.8
$\varepsilon_{ind,r}^{(30)}$       16.5         36.1
$\Delta E^{SCF}$                   16.3         55.7
$\Delta E^{(2)}$                    8.3        -10.0
$\varepsilon_{disp}^{(30)}$        30.6         21.5
$\Delta E^{(3)}$                   26.3         20.2
$\Delta E^{MP3}$                   50.9         65.9

a) Approximate calculation from Ref. [79].
b) Accurate calculation from Ref. [72].
Table 3
The effect of the three-body contributions upon the frequency shift of the CO2 antisymmetric stretch (in μEh).

Three-body term                  <E>v=0      <E>v=1      <E>v=1 - <E>v=0
$\varepsilon_{exch}^{HL}$          0.774       0.629        -0.145
$\varepsilon_{exch,SE}^{HL}$      10.849      10.864         0.015
$\varepsilon_{exch,TE}^{HL}$     -10.077     -10.234        -0.158
$\varepsilon_{ind,r}^{(30)}$       4.376       4.815         0.438
$\Delta E_{def}^{SCF}$             1.542       1.970         0.428
$\Delta E^{(2)}$                   2.321       2.254        -0.067
$\varepsilon_{disp}^{(30)}$       27.170      27.069        -0.101

Sum: $\varepsilon_{exch}^{HL} + \varepsilon_{ind,r}^{(30)} + \varepsilon_{disp}^{(30)}$                         0.192
Sum: $\varepsilon_{exch}^{HL} + \Delta E_{def}^{SCF} + \Delta E^{(2)} + \varepsilon_{disp}^{(30)}$              0.125
Experiment [82]                                                                     0.192
frequency shift, i.e. the difference between the v=1 and v=0 levels. Although the dispersion term is the largest, it differs very little between the v=0 and v=1 states, and so do the SE and TE terms. The three-body induction term strongly differentiates between the two states, and thus has the most effect upon the shift. The dramatic change in the induction effect can be rationalized in terms of the appearance of a dipole moment when CO2 is deformed along the asymmetric stretching coordinate.
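The bookkeeping behind this argument can be checked with a few lines of arithmetic. The sketch below takes the v=0 and v=1 level values of the exchange, induction and dispersion terms from Table 3, forms their differences, and converts the summed shift from μEh to cm^-1 using 1 Eh = 219474.63 cm^-1. The dictionary keys and function names are illustrative only.

```python
# Three-body contributions to the v=0 and v=1 levels of the CO2 asymmetric
# stretch in Ar2CO2, in microhartree, copied from Table 3.
LEVELS = {
    "exch_HL": (0.774, 0.629),
    "ind_r_30": (4.376, 4.815),
    "disp_30": (27.170, 27.069),
}

HARTREE_IN_WAVENUMBERS = 219474.63  # 1 Eh expressed in cm^-1


def shift(term):
    """Contribution of one term to the frequency shift, <E>(v=1) - <E>(v=0)."""
    v0, v1 = LEVELS[term]
    return v1 - v0


def microhartree_to_wavenumbers(value):
    """Convert an energy from microhartree to cm^-1."""
    return value * 1.0e-6 * HARTREE_IN_WAVENUMBERS


shifts = {term: shift(term) for term in LEVELS}
total_shift = sum(shifts.values())                       # in microhartree
total_shift_cm = microhartree_to_wavenumbers(total_shift)
dominant_term = max(shifts, key=lambda term: abs(shifts[term]))

print(shifts)
print(total_shift, total_shift_cm, dominant_term)
```

With the rounded table entries the summed shift comes out within 0.001 μEh of the tabulated sum, and the converted value can be compared directly with the 0.042 cm^-1 nonadditivity reported by Sperhac et al. [82].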
5.2 Water trimer: induction nonadditivity
In the water trimer induction nonadditivity provides a dominant contribution, which effectively overshadows all the other terms. Its mechanism is simple. For instance, in a cyclic water trimer the multipoles of A inductively alter the multipoles at B, which, in turn, inductively alter the multipoles at C, which then alter those on A, and so on, until self-consistency is reached. Various formulations of this simple model have been implemented in simulations since the 1970s [84-87,63,64,50]. To include the many-body induction effects of point charges interacting with a set of polarizable atomic centers the following classical electrostatics equation is solved iteratively

$$E_{pol} = -\frac{1}{2} \sum_i \mu_i \cdot F_i^0 \qquad (20)$$

where $\mu_i$ is the induced dipole on center i and $F_i^0$ is the electric field at center i arising from all other fixed charges in the cluster. The induced dipole moment $\mu_i$ and the total electric field $F_i$ at the polarizable center i are evaluated self-consistently from the following expressions:

$$\mu_i = \alpha_i F_i \qquad (21)$$

and

$$F_i = F_i^0 + \sum_{j \neq i} T_{ij} \mu_j \qquad (22)$$

where $\alpha_i$ is the polarizability of the center (e.g. atom) i and $T_{ij}$ is the dipole-dipole interaction tensor closely related to the interaction matrix described earlier, cf. Eq.(10). Eqs.(20-22) neglect the exchange effects arising in $\Delta E_{def}^{SCF}$. Yet, they approximate the latter fairly well over a wide range of configurations [45]. This approach has been successfully incorporated into the nonadditive molecular dynamics simulations [88]. The neglect of other nonadditive effects (exchange and dispersion) appears to be justified for low energy configurations. They become important, however, for repulsive geometries, e.g. at barriers separating superimposable structures [45]. An ab initio model of nonadditive effects in water, which also includes SE and TE components of the exchange repulsion component, the dispersion term and perhaps some other, secondary terms should be investigated in the near future. So far the effect of nonadditivity in water has been studied in the context of various structural properties, vibrationally averaged structures, O-H frequency shifts [89], zero-point energies, rotational constants, cluster predissociation dynamics and tunneling splittings [58].
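The iterative solution of Eqs.(20-22) can be sketched in a few lines. The example below is a minimal illustration, not the scheme used in Refs. [84-88]. It assumes isotropic point polarizabilities, units in which the point-dipole tensor is T_ij = (3 r r^T - r^2 I) / r^5 with r = r_i - r_j, and a plain fixed-point iteration. The permanent fields F_i^0 are passed in directly instead of being built from point charges, and all function names are hypothetical.

```python
def dipole_tensor(ri, rj):
    """Point-dipole tensor T_ij = (3 r r^T - r^2 I) / r^5 with r = ri - rj."""
    r = [a - b for a, b in zip(ri, rj)]
    r2 = sum(c * c for c in r)
    r5 = r2 ** 2.5
    return [[(3.0 * r[a] * r[b] - (r2 if a == b else 0.0)) / r5
             for b in range(3)] for a in range(3)]


def induced_dipoles(positions, alphas, fields0, tol=1e-12, max_iter=1000):
    """Solve Eqs. (21)-(22) by fixed-point iteration and return (mu, E_pol)."""
    n = len(positions)
    # First guess: Eq. (21) with the permanent field only.
    mu = [[alphas[i] * f for f in fields0[i]] for i in range(n)]
    for _ in range(max_iter):
        new_mu = []
        for i in range(n):
            field = list(fields0[i])  # F_i^0
            for j in range(n):
                if j == i:
                    continue
                t = dipole_tensor(positions[i], positions[j])
                for a in range(3):
                    # Add the field of the induced dipole on j, Eq. (22).
                    field[a] += sum(t[a][b] * mu[j][b] for b in range(3))
            new_mu.append([alphas[i] * f for f in field])  # Eq. (21)
        change = max(abs(new_mu[i][a] - mu[i][a])
                     for i in range(n) for a in range(3))
        mu = new_mu
        if change < tol:
            break
    else:
        raise RuntimeError("induced dipoles did not converge")
    # Polarization energy, Eq. (20).
    e_pol = -0.5 * sum(mu[i][a] * fields0[i][a]
                       for i in range(n) for a in range(3))
    return mu, e_pol


# Two identical centers on the z axis in the same permanent field along z.
mu, e_pol = induced_dipoles(
    positions=[(0.0, 0.0, 0.0), (0.0, 0.0, 3.0)],
    alphas=[1.0, 1.0],
    fields0=[(0.0, 0.0, 0.01), (0.0, 0.0, 0.01)],
)
```

For this two-center case the result can be compared with the closed form mu_z = alpha F / (1 - 2 alpha / R^3). The difference between that value and alpha F is the mutual polarization that a pairwise additive model built on the permanent fields alone would miss.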
The importance of three-body effects in the determination of macroscopic properties has also been studied. Recently, nonadditive molecular dynamics simulation of water and organic liquids has been performed [86].
6. SUMMARY The nature of Van der Waals binding may be described in terms of four basic types of interactions: electrostatic, induction, dispersion and exchange. These interactions are useful to classify and understand the physical origin of intermolecular potentials, and the probable structures of Van der Waals complexes. In this context, they play a similar role to the concepts of covalent and ionic binding in strong chemical interactions. The fundamental interaction energy constituents are included in the rigorous quantum theory of weak molecular clusters. They can be calculated with a desired accuracy, and represented with analytical forms that are related to a variety of simple and physically sensible models. Today, both the ab initio calculations and the potential modeling can be performed for small- and medium-sized molecules, and provide reliable intermolecular PESs for a wide range of mutual intermolecular orientations. Future calculations should also include the intramonomer degrees of freedom, and incorporate them into the final potential form. Very little has been done in this area so far but the tools are already available. All these advances will be accompanied by simulations of the dynamics and vibrational averaging, with the ultimate goal of bridging the chasm between our understanding of the electronic structure of atoms and molecules, and macroscopic character of matter.
8. ACKNOWLEDGMENTS We thank Dr. Piotr Cieplak for reading and commenting on the manuscript, and Professor Richard Bader and Dr. Todd Keith for providing us with the codes that draw the Laplacian of the electron density. Gaussian 92 codes [90] were used for electronic structure calculations. Support by KBN through the Department of Chemistry, University of Warsaw, within the Project BST/532/23/97 and by the National Science Foundation (Grant no. CHE-9527099) is gratefully acknowledged. The Interdisciplinary Center of Modeling, University of Warsaw is acknowledged for the computational grant.
REFERENCES
1. L. Pauling, The Nature of the Chemical Bond and the Structure of Molecules and Crystals; An Introduction to Modern Structural Chemistry, 3rd ed., Cornell University Press, Ithaca, N.Y., 1960.
2. L. Pauling, General Chemistry, 3rd ed., W.H. Freeman, San Francisco, 1970.
3. L. Pauling in: The Chemical Bond. Structure and Dynamics, A. Zewail (ed.), Academic Press, San Diego, 1992.
4. J.O. Hirschfelder, C.F. Curtiss, and R.B. Bird, Molecular Theory of Gases and Liquids, Wiley, New York, 1954.
5. F. London, Trans. Faraday Soc. 33 (1937) 8.
6. A.D. Buckingham in: Intermolecular Interactions: From Diatomics to Biopolymers, B. Pullman (ed.), Wiley, New York, 1978.
7. P. Piecuch, in: Molecules in Physics, Chemistry and Biology, P. Maruani (ed.), Kluwer, Dordrecht, 1988, vol. 2, p. 417.
8. H. Margenau and N.R. Kestner, Theory of Intermolecular Forces, Pergamon, Oxford, 1971.
9. J.O. Hirschfelder and W.J. Meath, Adv. Chem. Phys. 12 (1976) 3.
10. J.N. Murrell, in: Rare-Gas Solids, M.L. Klein and J.A. Venables (eds.), Academic Press, London, 1976, p. 177.
11. B. Jeziorski and W. Kołos, in: Molecular Interactions, H. Ratajczak and W.J. Orville-Thomas (eds.), Wiley, New York, 1982, vol. 3, p. 1.
12. A.D. Buckingham and P.W. Fowler, J. Chem. Phys. 79 (1983) 6426; Can. J. Chem. 63 (1985) 1985.
13. C.E. Dykstra, Chem. Rev. 93 (1993) 2339.
14. A.D. Buckingham, P.W. Fowler, and J.M. Hutson, Chem. Rev. 88 (1988) 963.
15. A. van der Avoird, P.E.S. Wormer, and R. Moszyński, Chem. Rev. 94 (1994) 1931.
16. C. Bissonnette, K.G. Crowell, R.J. Le Roy, R.J. Wheatley, and W.J. Meath, J. Chem. Phys. 105 (1996) 2639.
17. D.M. Chipman and J.O. Hirschfelder, J. Chem. Phys. 59 (1973) 2838.
18. B. Jeziorski, R. Moszyński, and K. Szalewicz, Chem. Rev. 94 (1994) 1887.
19. G. Chałasiński and M.M. Szczęśniak, Chem. Rev. 94 (1994) 1723.
20. I.C. Hayes and A.J. Stone, Mol. Phys. 53 (1984) 83; ibid. 53 (1984) 69.
21. S.F. Boys and F. Bernardi, Mol. Phys. 19 (1970) 553.
22. G. Chałasiński and M. Gutowski, Chem. Rev. 88 (1988) 943.
23. F.B. van Duijneveldt, J.G.C.M. van Duijneveldt-van de Rijdt, and J.H. van Lenthe, Chem. Rev. 94 (1994) 1873.
24. M. Gutowski, G. Chałasiński, and M.M. Szczęśniak, Chem. Phys. Lett. 241 (1995) 140.
25. R.J. Bartlett and J.F. Stanton in: Reviews of Computational Chemistry, K.B. Lipkowitz and D.B. Boyd (eds.), VCH Publishers, New York, 1994, vol. 5, p. 65.
26. J.O. Hirschfelder, Chem. Phys. Lett. 1 (1967) 325.
27. K. Szalewicz and B. Jeziorski, Mol. Phys. 38 (1979) 191.
28. G. Chałasiński and M.M. Szczęśniak, Mol. Phys. 63 (1988) 205.
29. A.J. Stone and C.-S. Tong, J. Comput. Chem. 15 (1994) 1377.
30. C. Amovilli and R. McWeeny, J. Mol. Structure (Theochem) 227 (1991) 1.
31. B. Kukawska-Tarnawska, G. Chałasiński, and M.M. Szczęśniak, J. Mol. Structure (Theochem) 297 (1993) 313.
32. B. Kukawska-Tarnawska, G. Chałasiński, and K. Olszewski, J. Chem. Phys. 101 (1994) 4964.
33. R.F. Bader, Atoms in Molecules, Clarendon Press, Oxford, 1994.
34. J.C. Slater and J.G. Kirkwood, Phys. Rev. 37 (1931) 682.
35. L. Pauling and J.Y. Beach, Phys. Rev. 47 (1935) 686.
36. R. Burcl, G. Chałasiński, R. Bukowski, and M.M. Szczęśniak, J. Chem. Phys. 103 (1995) 1498.
37. M. Gutowski, J. Verbeek, J.H. van Lenthe, and G. Chałasiński, Chem. Phys. 111 (1987) 271.
38. (a) F.-M. Tao and Y.-K. Pan, J. Chem. Phys. 97 (1992) 4989. (b) F.-M. Tao and Y.-K. Pan, Chem. Phys. Lett. 194 (1992) 162.
39. M. Gutowski and L. Piela, Mol. Phys. 64 (1988) 337.
40. K. Morokuma and K. Kitaura, in: Molecular Interactions, H. Ratajczak and W.J. Orville-Thomas (eds.), Wiley, New York, 1982, vol. 1, p. 21.
41. L.A. Curtiss, A.J. Pochatko, A.E. Reed, and F. Weinhold, J. Chem. Phys. 82 (1985) 6833; E.D. Glendening and A. Streitwieser, J. Chem. Phys. 100 (1994) 2900.
42. A.J. Stone, Chem. Phys. Lett. 211 (1993) 101.
43. P.J. Marshall, M.M. Szczęśniak, J. Sadlej, G. Chałasiński, M.A. ter Horst, and C.J. Jameson, J. Chem. Phys. 104 (1996) 6569.
44. R.F. Bader and T.A. Keith, J. Chem. Phys. 99 (1993) 3685.
45. G. Chałasiński, M.M. Szczęśniak, P. Cieplak, and S. Scheiner, J. Chem. Phys. 94 (1991) 2873.
46. K.T. Tang and J.P. Toennies, J. Chem. Phys. 95 (1992) 5918.
47. M.A. ter Horst and C.J. Jameson, J. Chem. Phys. 105 (1996) 6787.
48. E.J. Bohac, M.D. Marshall, and R.E. Miller, J. Chem. Phys. 97 (1992) 4890.
49. O. Matsuoka, E. Clementi, and M. Yoshimine, J. Chem. Phys. 64 (1976) 1351.
50. U. Niesar, G. Corongiu, E. Clementi, G.R. Keller, and D.K. Bhattacharya, J. Phys. Chem. 94 (1990) 7949.
51. C. Millot and A.J. Stone, Mol. Phys. 77 (1992) 439.
52. A.J. Stone, The theory of intermolecular interactions, Clarendon Press, Oxford, 1996.
53. M.M. Szczęśniak, R.J. Brenstein, S.M. Cybulski, and S. Scheiner, J. Phys. Chem. 94 (1990) 1781.
54. A.J. Stone in: Theoretical Models of Chemical Bonding, Z.B. Maksić (ed.), Springer Verlag, Berlin, 1991, vol. 4, p. 103.
55. W. Rijks and P.E.S. Wormer, J. Chem. Phys. 90 (1989) 6507; ibid. 92 (1990) 5754.
56. S.C. Althorpe and D.C. Clary, J. Chem. Phys. 101 (1994) 3603.
57. J.K. Gregory and D.C. Clary, J. Chem. Phys. 101 (1995) 7817.
58. J.K. Gregory and D.C. Clary, J. Chem. Phys. 103 (1995) 8924.
59. J.K. Gregory and D.C. Clary, J. Chem. Phys. 105 (1996) 6626.
60. W.J. Meath and M. Koulis, J. Mol. Structure (Theochem) 226 (1991) 1.
61. G. Dotelli and L. Jansen, Physica A 234 (1996) 151.
62. V. Lotrich and K. Szalewicz, Chem. Phys. 106 (1997) 9688.
63. P. Cieplak, P.A. Kollman, and T.P. Lybrand, J. Chem. Phys. 92 (1990) 6755.
64. P. Cieplak and P. Kollman, J. Chem. Phys. 92 (1990) 6761; P. Cieplak, T.P. Lybrand, and P. Kollman, J. Chem. Phys. 86 (1987) 6393.
65. L. Perera and M.L. Berkowitz, J. Chem. Phys. 100 (1994) 3085.
66. W. Kołos, F. Nieves, and O. Novaro, Chem. Phys. Lett. 41 (1976) 431.
67. J. Higgins, C. Callegari, J. Reho, F. Stienkemeier, W.E. Ernst, K.K. Lehmann, M. Gutowski, and G. Scoles, Science 273 (1996) 629.
68. M. Wilson and P.A. Madden, J. Phys. Condensed Matter 6 (1994) 159.
69. R. Moszyński, P.E.S. Wormer, B. Jeziorski, and A. van der Avoird, J. Chem. Phys. 103 (1995) 8058.
70. V. Lotrich and K. Szalewicz, J. Chem. Phys. 106 (1997) 9668.
71. M.M. Szczęśniak and G. Chałasiński, in: Molecular Interactions, S. Scheiner (ed.), Wiley, Chichester, 1997, p. 45.
72. G. Chałasiński, J. Rak, M.M. Szczęśniak, and S.M. Cybulski, J. Chem. Phys. 106 (1997) 3301.
73. M.J. Elrod and R.J. Saykally, Chem. Rev. 94 (1994) 1975.
74. L. Jansen, Adv. Quantum Chem. 2 (1965) 119.
75. A.R. Cooper and J.M. Hutson, J. Chem. Phys. 98 (1993) 5337.
76. B.M. Axilrod and E. Teller, J. Chem. Phys. 11 (1943) 299.
77. Y. Muto, Proc. Phys. Math. Soc. Jpn. 17 (1943) 629.
78. M.M. Szczęśniak, G. Chałasiński, and P. Piecuch, J. Chem. Phys. 99 (1993) 6732.
79. S.M. Cybulski, M.M. Szczęśniak, and G. Chałasiński, J. Chem. Phys. 101 (1994) 10708.
80. A. Ernesti and J.M. Hutson, Phys. Rev. A 51 (1995) 239.
81. P. Niyaz, Z. Bačić, J.W. Moskowitz, and K.E. Schmidt, Chem. Phys. Lett. 252 (1996) 23; Z. Bačić, American Conference on Theoretical Chemistry, 1996.
82. J.M. Sperhac, M.J. Weida, and D.J. Nesbitt, J. Chem. Phys. 104 (1996) 2202.
83. J. Rak, M.M. Szczęśniak, G. Chałasiński, and S.M. Cybulski, J. Chem. Phys. 106 (1997) 3301.
84. F.H. Stillinger and C.W. David, J. Chem. Phys. 69 (1978) 1473.
85. P. Barnes, J.L. Finney, J.D. Nicholas, and J.E. Quinn, Nature 282 (1979) 459.
86. J.A. Rullman and P.T. van Duijnen, Mol. Phys. 63 (1988) 451.
87. M. Sprik and M.L. Klein, J. Chem. Phys. 89 (1988) 7556.
88. J.W. Caldwell, L.X. Dang, and P.A. Kollman, J. Am. Chem. Soc. 112 (1991) 9144; J.W. Caldwell and P.A. Kollman, J. Phys. Chem. 99 (1995) 6208.
89. J.G.C.M. van Duijneveldt-van de Rijdt and F.B. van Duijneveldt, Chem. Phys. 175 (1993) 271.
90. M.J. Frisch, G.W. Trucks, M. Head-Gordon, P.M.W. Gill, M.W. Wong, J.B. Foresman, B.G. Johnson, H.B. Schlegel, M.A. Robb, E.S. Replogle, R. Gomperts, J.L. Andres, K. Raghavachari, J.S. Binkley, C. Gonzalez, R.L. Martin, D.J. Fox, D.J. Defrees, J. Baker, J.J.P. Stewart, and J.A. Pople, Gaussian 92, Gaussian, Inc., Pittsburgh PA, 1992.
Z.B. Maksić and W.J. Orville-Thomas (Editors) Pauling's Legacy: Modern Modelling of the Chemical Bond
Theoretical and Computational Chemistry, Vol. 6 © 1999 Elsevier Science B.V. All rights reserved.
The nature of the chemical bond in metals, alloys, and intermetallic compounds according to Linus Pauling Zelek S. Herman* Herman Scientific Consulting, 521 Del Medio Avenue, #107, Mountain View, CA 94040 USA
ABSTRACT A review of the unsynchronized-resonating-covalent-bond theory of metals is presented. Key concepts, such as unsynchronous resonance, hypoelectronic elements, buffer elements, and hyperelectronic elements, are discussed in detail. Applications of the theory to the atomic volume of the constituents in alloys, the structure of boron, and superconductivity are discussed. These ideas represent Linus Pauling's understanding of the nature of the chemical bond in metals, alloys, and intermetallic compounds.
1 INTRODUCTION†
A metal is a substance that possesses several of the following properties: It is a solid at room temperature; it is opaque to light, and when polished it is a good reflector of light, having a peculiar appearance, called metallic luster; it is a good or fairly good conductor of heat and electricity; and it is malleable (capable of being hammered into thin sheets) and ductile (capable of being drawn into wires). Gold, for example, is so malleable that it can be hammered into foil so thin that it is transparent to visible light. Among elemental metals, mercury is exceptional in being liquid at room temperature, although the melting point of gallium is 29.8 °C and it is a liquid from this temperature until its boiling point, 2905 °C.
*Editor's note: For over 14 years, Dr. Herman was Professor Linus Pauling's collaborator at the Linus Pauling Institute of Science and Medicine in Palo Alto, California. Dr. Herman and Pauling's long-time assistant, Dorothy Bruce Munro, are the co-compilers of The Publications of Professor Linus Pauling, which is published on the Internet at http://charon.lpi.org/-zeke.
†This section is based upon the introduction to a book bearing the title of this chapter that was begun by Linus Pauling several years before his death but never completed.
Some elemental metals and many intermetallic compounds are brittle, not malleable or ductile. Borderline substances, showing metallic properties to a decreased extent, are called metalloids or semiconductors. Probably the best criterion for distinguishing between a metal and a metalloid or semiconductor is the temperature coefficient of thermal and electrical conductivity. With increase in temperature, the thermal and electrical conductivity of a metal decreases, whereas that of a metalloid or semiconductor increases. Six elemental metals are mentioned in the Old Testament (e.g., Numbers 31:22): gold (Hebrew: zahav), silver (kesef, also the Biblical Hebrew word for money), copper (nechosheth, often translated into English as brass, which is an alloy of copper and zinc and may, in fact, not have been known in the time of Moses, or bronze, an alloy of copper and tin, which was known in the time of Moses), iron (barzel), tin (b'deel), and lead (ofereth). The ancient Greeks also recognized mercury. About seventy-five elemental metals are now known. Many metals become superconducting at low temperatures, that is, their electrical resistance is essentially zero. However, copper, silver, and gold, which are among the best electrical conductors, do not become superconducting at any known temperature. Superconductors are generally classified as being one of two types: Type I superconductors completely repel an external magnetic field at temperatures below the superconducting transition temperature (a phenomenon known as the complete Meissner effect) whereas type II superconductors do not completely repel an external magnetic field at temperatures below the superconducting transition temperature but instead go through a so-called vortex state. Two forms of elemental tin are known. One, gray tin, which has a diamond structure with tetrahedral bonding, is a metalloid. White tin, which has a body-centered tetragonal structure, is metallic.
Experimental measurements indicate that the two forms of tin lie very close to each other in energy, with gray tin being more stable at lower temperature. The phase transition between the two forms occurs at 13 °C; at temperatures below 18 °C white tin slowly changes to gray tin. At very low temperatures (< -40 °C) the conversion is sometimes so rapid that metallic tin objects fall into a powder of gray tin; this phenomenon has been called the "tin pest". Nevertheless, white tin becomes a type I superconductor, with the value of the superconducting transition temperature equal to 3.722 K. Another interesting property of white tin is that when rods of it are bent, it makes a distinctive sound, known as the "tin cry". This is due to the breaking of the microcrystals in the highly crystalline structure of white tin. An alloy is a metallic substance formed by melting together or otherwise mixing two or more elements, at least one of which is a metal. Many alloys are intermetallic compounds, with atoms of different elements in a well-defined ratio. Often the phase exists with a range of compositions. Such a phase may be called a solid solution or crystalline solution. Most metals are mutually soluble in the liquid state, showing a single liquid phase over the entire composition range, even when the melting points of the individual elemental metals are very different. There are some exceptions, however. Two liquid phases are observed for lead and zinc, lead and gallium, lead and iron, silver and nickel, silver and
chromium, and some other binary alloys. Presumably the factors that lead to the formation of two liquid phases are that the bonds between unlike atoms are not stronger than the average of bonds between the like atoms, and that the structures of the two liquids are sufficiently different as to make it difficult for the molten alloy to assume an intermediate structure. The phase diagrams for the binary-metal systems show great variety [1]. One extreme is illustrated by Ag-Au, for which there is found a complete series of solid solutions with the cubic close-packed structure, and solidus and liquidus curves extending smoothly between the melting points 961.5 °C for Ag and 1064.4 °C for Au. Another simple system is Mg-Sn, with three phases at room temperature: nearly pure Mg and nearly pure Sn, and a well-defined compound, Mg2Sn. This compound has the fluorite structure, and its composition might be considered the expected one for bivalent magnesium and quadrivalent tin. Another system in which well-defined compounds occur is Ag-Sr. Here there are four intermediate phases, Ag5Sr (781 °C), Ag5Sr3 (760 °C), AgSr (~680 °C), and Ag2Sr3 (665 °C). It is not possible to interpret these formulas in terms of the usual valences of the elements, in contrast to the situation for inorganic compounds of metals and nonmetals. Another example of the lack of correlation with the periodic table is provided by compounds of the alkali metals with cadmium. In the Li-Cd system there are the compounds LiCd (549 °C), LiCd3 (370 °C), and Li3Cd (272 °C). Two compounds, NaCd2 (384 °C) and Na2Cd11 (364 °C), occur in the Na-Cd system, and in the other three systems only one compound, KCd13, RbCd13, and CsCd13, is found [2]. Compounds formed by the alkali metals are rather similar.
2 QUANTUM MECHANICS AND THE NATURE OF METALS
The quantum mechanical treatment of molecules can be essentially classified into two main types. The first, molecular orbital theory, as developed by Mulliken, Slater, Pauling, and many others, has enjoyed great success owing to its relative ease of computational implementation. Most people do not associate the name of Linus Pauling with molecular orbital theory. However, Linus Pauling, in fact, in a paper with the title "The Application of the Quantum Mechanics to the Structure of the Hydrogen Molecule and Hydrogen Molecule-ion and to Related Problems," published in the June 1928 issue of Chemical Reviews, introduced the notion that the Pauli exclusion principle can be satisfied by constructing a determinant of spin-orbit functions [3,4]. This determinant is now known as the "Slater determinant" and plays a central role in molecular orbital theory. The alternative to molecular orbital theory, valence-bond theory, as developed by Heitler, London, Slater, and Pauling, is not as easily amenable to computational implementation. Consequently, it has not been employed to any great extent in the detailed computational investigation of molecular systems. Nevertheless, scientists, when visualizing the chemical bond in
molecules, still think of the conceptually simple framework provided by the valence-bond method. Furthermore, in a note published in 1932, John C. Slater pointed out that, when each method is refined, the molecular-orbital treatment of molecules and the valence-bond treatment of molecules converge to the same result [5].
An analogous situation exists for the quantum mechanical treatment of metals, alloys, and intermetallic compounds. Band theory, the commonly employed quantum mechanical theory of the electronic structure of metals, alloys, and intermetallic compounds, started with the discussion by Pauli of the temperature-independent paramagnetism of the alkali metals [6]. This theory was further developed by Sommerfeld, Houston, Eckert, Bloch, Frenkel, Slater, Mott, Jones, Wilson, Bethe, Seitz, Kittel, and many others to the point where band theory provides a good understanding of many of the properties of metals in terms of a calculational method that yields numerical results in good agreement with experiment [7-20]. In the band theory of metals, the outer electrons are treated as nearly free, so that they can move through the system under investigation in the way described by Bloch functions, which account for the periodicity of the system, and interact with the atomic ions arranged in closest packing or some other structure. The local-density-approximation [21] calculations of Moruzzi, Janak, and Williams [22] concerning the electronic properties of metals are characteristic of the considerable success achieved by band theory in its numerical application to the study of the physical properties of particular metals.
Simply put, in band theory a substance will display metallic electrical conductivity if the valence band is not completely filled by the valence electrons (and not separated by a large gap from the unoccupied conduction band) since the application of an electric field to the substance will cause the excitation of electrons into states in the conduction band, so that they move through the system. Nevertheless, similar to the situation for the molecular orbital theory of compounds, a number of theoretical and practical problems remain concerning the application of band theory, for example, possible linear dependence of the basis set, lack of convergence, and the need to take explicit account of electron correlation. More importantly, band theory does not provide a conceptually simple way to visualize the nature of the chemical bond in metals.
An alternative to the band theory is the unsynchronized-resonating-covalent-bond theory of metals, alloys, and intermetallic compounds developed by Linus Pauling and some of his coworkers, initially from an empirical investigation of the saturation magnetic moment of the first-row transition metal alloys [23] and later derived theoretically on statistical grounds [24-38]. The basic premise of the unsynchronized-resonating-covalent-bond theory of metals, alloys, and intermetallic compounds is that the electrons in such systems, as in other substances, are bound to atoms and occupy atomic orbitals. The outer electrons may interact in such a way as to form covalent bonds between the atoms. However, in a metal, alloy, or intermetallic compound, unlike the situation for other substances, the number of bond positions is larger than the number of bonds, leading to resonance of the bonds among the alternate positions and resulting in electrical conductivity under the influence of an applied electric field. Electronic correlation is built into the theory.
One of the salient features of the unsynchronized-resonating-covalent-bond theory of metals, alloys, and intermetallic compounds is that, on average, 0.72 of an orbital per atom must not be occupied either by a bonding electron or an unshared pair of electrons in order for the unsynchronized resonance that confers metallic properties on a substance to occur. This 0.72 of an orbital per atom has been given the appellation the metallic orbital. Moreover, a substance will display metallic character if it has a metallic orbital available for electrons to move through the substance under the influence of an applied electric field. One of the lasting practical results of treating metals in this model has been the tabulation of atomic radii and interatomic distances in metals [39-42]. Another interesting application of the unsynchronized-resonating-covalent-bond theory of metals is its use in the elucidation of the structure and properties of elemental boron and the boranes [43].
3 THE METALLIC ORBITAL
In 1938, Linus Pauling concluded from an investigation of the physical properties of the metals that, in a sequence such as K, Ca, Sc, Ti, V, Cr, the number of bonding electrons increases monotonically from 1 to 6, remains constant at 6 from Cr to Ni (except for Mn, which has an anomalous structure), and then begins to decrease. He published a curve, now known as the Slater-Pauling curve, of the saturation ferromagnetic moment per atom for the alloys containing chromium through copper in the first-row transition metals [23]. This curve, shown in Figure 1, indicates that the saturation magnetic moment rises to a value of approximately 2.4 Bohr magnetons for Fe-Co alloys, with the maximum value occurring at approximate composition Fe72Co28. Then this value decreases to zero for the alloy with approximate composition Ni44Cu56. By assuming that each of the six bonds in these alloys requires an orbital, he concluded that, for the transition metals, only 8.28 of the nine 3d, 4s, and 4p valence orbitals are occupied by bonding electrons, ferromagnetic electrons, or electron pairs, and consequently, that 0.72 of an orbital per atom, on average, is without any apparent use. Earlier, in 1928, Heisenberg had discussed the spin-polarization of covalent bonds as the mechanism of interatomic interaction leading to ferromagnetism [44]. Ten years later, in 1948, Pauling realized that this apparently unused atomic orbital has an important function [24, 27, 30]. Consider a metal consisting of N identical atoms M with valence v, that is, each atom forms v covalent bonds with adjacent atoms. Now, if the number of bond positions were greater than the number of bonds, then the bonds could resonate from one position to another only synchronously, with pairs of bonds changing positions simultaneously:
Figure 1. The Slater-Pauling curve displaying saturation ferromagnetic moment for the first-row transition metal alloys (horizontal axis: composition, Cr, Mn, Fe, Co, Ni, Cu). This figure shows a comparison of experimental values (solid curves) and predicted values (dashed lines) of the saturation ferromagnetic moment per atom, in Bohr magnetons, for Fe-Co, Co-Ni, and Ni-Cu alloys. The short vertical lines indicate change in crystal structure. When the Zener contribution is taken into account, the slope of the dashed line from Fe72Co28 to Ni44Cu56 changes from -1, as shown, to -1.11.
[Scheme: SYNCHRONOUS RESONANCE. Among four atoms M, two bonds change positions simultaneously.]
However, the principle of approximate electroneutrality [45, 46] allows for the occurrence of M⁺ and M⁻, with valences v - 1 and v + 1, respectively. Therefore, under the condition that there is an available orbital, unsynchronous resonance, involving the shift of a single covalent bond about an atom from one position to another, can then occur:
[Scheme: UNSYNCHRONOUS RESONANCE. A single bond among four atoms M shifts to a new position, producing one M⁺ and one M⁻.]
In order for unsynchronous resonance to occur, the atoms M⁺ and M⁰ must have an unoccupied orbital available so that they can accept an additional bond. M⁻ does not require such an unoccupied orbital because the electroneutrality principle rules out its accepting an additional bond, which would convert it to M²⁻. Accordingly, the structural requirement for a system to possess metallic character is that the atoms M⁺ and M⁰ have available an unoccupied orbital, called the metallic orbital. The average value of 0.72 orbital per atom for the metallic orbital, as deduced from the Slater-Pauling curve, implies that, with unsynchronous resonance of the covalent bonds, the metal consists of 28% M⁺, 44% M⁰, and 28% M⁻. As will be seen in the statistical theory described in the following section, there exist far more unsynchronized resonating structures per atom than there are synchronized resonating structures. Associated with this increase in the number of resonating structures is an increase in stability for the system, with the increased resonance stabilization energy being approximately proportional to the number of additional resonating structures per atom for unsynchronous resonance, less 1. One is consequently led to conclude that unsynchronized resonance of the covalent bonds between the atoms in metallic systems occurs
because of the increased stability resulting from the large number of resonance structures associated with such resonance. Furthermore, the electrical conductivity of metals is a direct consequence of unsynchronous resonance in that the bonds resonate with electronic frequencies as the positive and negative charges pivot from atom to adjacent atom under an applied electric field, as illustrated schematically for the case of lithium metal in Figure 2. The frequency of such pivoting motion is determined by the resonance energy, which is comparable in magnitude to the bond energy and is only about one order of magnitude less than the binding energy of a valence electron to the atom. Linus Pauling first discussed this explanation of metallic conduction in 1948 [25, 27]. Other characteristic properties of metals, such as high thermal conductivity, high ductility and malleability, and negative temperature coefficients of electrical conductivity, can similarly be rationalized in terms of the unsynchronized resonance of covalent bonds. Thus, for example, the negative temperature coefficient of the electrical conductivity is a result of thermal agitation temporarily lengthening some bonds and shortening others. This process interferes somewhat with the resonance of the bonds, which does not occur as frequently between non-equivalent positions as between equivalent, or equienergetic, positions. This explanation for the negative temperature coefficient of metallic electrical conductivity agrees with the usual one, involving scattering of the electrons by phonons. Similarly, unsynchronous resonance accounts for the high ductility and malleability of metals because many more pivoting positions accompany unsynchronous resonance than accompany synchronous resonance.
Another important feature of the unsynchronized-resonating-covalent-bond theory of metals is that it accommodates the Zener theory of the interatomic interaction that results in ferromagnetism [47]. From Figure 1 it is seen that the saturation magnetic moment of the Fe-Co-Ni-Cu alloys displays nearly linear behavior, over part of its course, as a function of the number of outer electrons in the alloy, although the slope of the negative line is approximately 11% greater in magnitude than that expected from the electron number. It is here that the Zener theory comes into play, for Zener proposed that, under the influence of the atomic magnetic moments of the atoms, a pair of electrons involved in the covalent bond may be decoupled to produce two electrons with spins oriented either parallel or antiparallel to the ferromagnetic moment of the crystal and engaged in the formation of one-electron bonds between atoms. The Zener theory of uncoupling of electron spins by the atomic magnetic moments is similar to the uncoupling of electron spins by an external magnetic field that was proposed by Pauli to account for the temperature-independent paramagnetism of the alkali metals. If it is accepted that the saturation magnetic moment is increased in value by 11% over the atomic value by this uncoupling, then the observed value for iron, 2.22 Bohr magnetons, can be decomposed into a value of 2.00 Bohr magnetons for the atomic moment and 0.22 Bohr magnetons for the one-electron bonds, the other 5.78 of the total valence of 6 being electron-pair bonds. Figure 1 further shows that the value of the saturation magnetic moment becomes 0 at the composition Ni44Cu56, corresponding to 10.56 outer electrons
Figure 2. Diagram illustrating motion in lithium metal of a negative charge (an electron) from the cathode to the anode by successive pivoting resonances of a covalent bond.
per atom. Of these, 6 are involved in covalent-bond formation, using 6 valence orbitals. This leaves 4.56 electrons per atom as electron pairs, occupying 2.28 orbitals. Thus, 8.28 orbitals of the 9 available are occupied, leaving 0.72 metallic orbital per atom to engage in unsynchronous resonance. Other ways of estimating the number of metallic orbitals per atom, such as the comparison of the observed interatomic distance in white tin and gray tin, or the oxidation numbers of the atoms in the superconducting copper-oxide materials [38], yield less accurate values of about 0.7 metallic orbital per atom.
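The orbital bookkeeping just described can be checked with a few lines of arithmetic. The sketch below is only an illustration; the function and variable names are hypothetical and are not taken from the original papers.

```python
# Orbital bookkeeping at the foot of the Slater-Pauling curve (Ni44Cu56).
# All names here are illustrative assumptions, not notation from the source.
OUTER_ORBITALS = 9  # five 3d + one 4s + three 4p


def metallic_orbital_fraction(n_electrons, n_bonds, n_orbitals=OUTER_ORBITALS):
    """Orbitals per atom left unoccupied by bonds and unshared pairs."""
    paired_electrons = n_electrons - n_bonds   # electrons not used in bonds
    pair_orbitals = paired_electrons / 2       # two electrons per orbital
    occupied = n_bonds + pair_orbitals         # one orbital per bond
    return n_orbitals - occupied


# 10.56 outer electrons per atom and 6 bonds, as stated in the text
omega = metallic_orbital_fraction(10.56, 6)
print(round(omega, 2))  # 0.72
```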
4
THE DETAILED ANALYSIS OF THE STATISTICAL THEORY OF UNSYNCHRONIZED RESONANCE OF COVALENT BONDS
In a footnote to his 1949 paper entitled "A Resonating-Valence-Bond Theory of Metals and Intermetallic Compounds," Linus Pauling gave an example of a simple statistical treatment to derive the metallic orbital [27]. Nevertheless, it took him three and one-half decades to publish the detailed statistical treatment [34-36], which is given in the following. Let us consider a crystal composed of N identical atoms, each with covalence v and ligancy L. The number of ways of distributing the vN/2 bonds among the LN/2 bond positions is given by:
$$W = \frac{(LN/2)!}{[(L-v)N/2]!\,[vN/2]!} , \tag{1}$$

if we exclude multiple occupancy. Using Stirling's approximation

$$t! = (2\pi t)^{1/2}\,(t/e)^{t} , \tag{2}$$

the number of ways w = W^{1/N} in which bonds are arranged around each atom is found to be:

$$w = \frac{L^{L/2}}{v^{v/2}\,(L-v)^{(L-v)/2}} . \tag{3}$$
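The step from eqn. (1) to eqn. (3) can be sketched as follows. The sketch keeps only the leading part of eqn. (2), ln t! ≈ t ln t - t, on the assumption (not stated explicitly in the text) that the square-root prefactors are neglected because their contribution to W^{1/N} tends to unity for large N.

```latex
\begin{align*}
\ln W &\approx \tfrac{LN}{2}\ln\tfrac{LN}{2}
        - \tfrac{(L-v)N}{2}\ln\tfrac{(L-v)N}{2}
        - \tfrac{vN}{2}\ln\tfrac{vN}{2} \\
      &= \tfrac{N}{2}\bigl[\,L\ln L - (L-v)\ln(L-v) - v\ln v\,\bigr], \\
w = W^{1/N} &= \frac{L^{L/2}}{v^{v/2}\,(L-v)^{(L-v)/2}} .
\end{align*}
```

The terms linear in t cancel because LN/2 = (L-v)N/2 + vN/2, and the ln(N/2) terms cancel for the same reason.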
Let us further consider the number of resonance structures, ν, in which n bonds are arranged about an atom with average valence v in the crystal. This number of resonance structures per atom is proportional to the probability given by the binomial distribution, with proportionality constant w, or

$$\nu(L,v,n) = w\,\frac{v^{n}\,(L-v)^{L-n}\,L!}{L^{L}\,n!\,(L-n)!} . \tag{4}$$
For the case of synchronous resonance, n = v, and the number of resonance structures per atom, ν_synch, becomes

$$\nu_{\mathrm{synch}} = \frac{v^{v/2}\,(L-v)^{(L-v)/2}\,L!}{L^{L/2}\,v!\,(L-v)!} . \tag{5}$$
It is interesting to note that, for L = 4 and v = 2, eqn. (5) gives R ln(3/2) for the residual entropy of ice; this value differs by only 1% from that given by calculations not involving the approximations made here [48]. We must now consider two classes of metals, hypoelectronic metals and hyperelectronic metals [29]. A hypoelectronic metal is one composed of atoms in which the number of outer electrons is less than the number of outer orbitals, and a hyperelectronic metal is one composed of atoms in which the number of outer electrons is greater than the number of outer orbitals. For a metal composed of hypoelectronic atoms, the number of bonds n can take the values v - 1, v, and v + 1, corresponding to M⁺, M⁰, and M⁻, respectively, and eqn. (4) yields the following expression for the number of unsynchronized resonance structures per atom:
$$\nu_{\mathrm{hypo}} = \frac{v^{v/2}\,(L-v)^{(L-v)/2}\,L!}{L^{L/2}\,v!\,(L-v)!}\left[\frac{L-v}{L-v+1} + 1 + \frac{v}{v+1}\right] . \tag{6}$$
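Equations (5) and (6) are straightforward to evaluate numerically. The following is a minimal sketch, assuming integer L and v; the function names `nu_synch` and `nu_hypo` are hypothetical labels introduced here for illustration.

```python
from math import factorial


def nu_synch(L, v):
    """Eqn. (5): synchronized resonance structures per atom (integer L, v)."""
    return (v ** (v / 2) * (L - v) ** ((L - v) / 2) * factorial(L)
            / (L ** (L / 2) * factorial(v) * factorial(L - v)))


def nu_hypo(L, v):
    """Eqn. (6): unsynchronized structures per atom, hypoelectronic metal."""
    # Square-bracket factor: contributions of n = v - 1, v, and v + 1
    ratio = (L - v) / (L - v + 1) + 1 + v / (v + 1)
    return nu_synch(L, v) * ratio


print(nu_synch(4, 2))             # 1.5, the value behind R ln(3/2) for ice
print(round(nu_hypo(12, 6), 2))   # compare with the L = 12, v = 6 entry of Table 1
```

Evaluating `nu_hypo` over a grid of L and v should reproduce the entries tabulated for hypoelectronic metals to the two decimals quoted.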
A comparison of eqns. (5) and (6) reveals that the term in square brackets in eqn. (6) is the ratio of the number of unsynchronized resonance structures per atom to the number of synchronized resonance structures per atom for a hypoelectronic atom. Given the reasonable assumption that the energy corresponding to an unsynchronized resonance structure is the same order of magnitude as that for a synchronized resonance structure, the energy of a crystal composed of hypoelectronic atoms is lowered considerably via unsynchronized resonance. Therefore, one predicts that every element with an extra orbital to serve as the metallic orbital should be a metal. With a single possible exception, namely boron, which will be discussed in a succeeding section, this prediction is borne out. In Table 1 the number of unsynchronized resonance structures per atom for hypoelectronic metals with various values of the ligancy L and valence v are given. These are also shown in Figure 3, from which it is seen that a maximum in the number of unsynchronized resonance structures per atom for hypoelectronic metals occurs at v = L/2. As will be
Table 1. Number of unsynchronized resonance structures per atom as a function of valence v and ligancy L for hypoelectronic metals.

L \ v      1       2       3       4       5       6       7       8
  2      2.00
  3      2.50    2.50
  4      2.92    3.50    2.92
  5      3.29    4.49    4.49    3.29
  6      3.62    5.48    6.25    5.48    3.62
  7      3.93    6.47    8.18    8.18    6.47    3.93
  8      4.21    7.45   10.26   11.37   10.26    7.45    4.21
  9      4.47    8.44   12.49   15.08   15.08   12.49    8.44    4.47
 10      4.72    9.42   14.85   19.28   21.00   19.28   14.85    9.42
 11      4.96   10.40   17.35   24.00   28.10   28.10   24.00   17.35
 12      5.19   11.39   19.97   29.21   36.44   39.19   36.44   29.21
 13      5.40   12.37   22.71   34.93   46.10   52.81   52.81   46.10
 14      5.61   13.35   25.56   41.16   57.12   69.21   73.73   69.21
 15      5.81   14.33   28.52   47.89   69.57   88.66   99.91   99.91
 16      6.01   15.32   31.58   55.13   83.51  111.41  132.03  139.65

Figure 3. The number of resonance structures per atom (vertical axis) for hypoelectronic metals (□) and for hyperelectronic metals (△) as a function of unit increase in valence v (horizontal axis) and ligancy L. Note that the maximum for each L occurs at v = L/2.
discussed later, this fact is of paramount importance in explaining the structure of such metals. For the case of hyperelectronic metals, that is, a substance composed of elements for which the number of outer electrons is greater than the number of outer orbitals, not including the metallic orbital, the statistical treatment is somewhat more complicated [34]. Let us first consider the valence v of such a metal. The neutral atoms M⁰ form z bonds, and the ions M⁺ and M⁻ form z + 1 bonds. Denote the fractions of M⁺, M⁰, and M⁻ by y, x, and y, respectively. Then from eqn. (4) the ratio of the number of neutral atoms to the number of ions, x/2y, is given by
$$\frac{x}{2y} = \frac{\nu(L,v,z)}{\nu(L,v,z+1)} = \frac{(L-v)(z+1)}{v\,(L-z)} , \tag{7}$$

subject to the constraints

$$x + 2y = 1 \tag{8}$$

and

$$v = xz + 2y\,(z+1) . \tag{9}$$

Eqns. (7)-(9) have one solution in the range z < v < z + 1:

$$v = z + [f(f+1)]^{1/2} - f , \tag{10}$$

where

$$f = \frac{z\,(L-z)}{L-2z-1} . \tag{11}$$
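A short numerical sketch of eqns. (10) and (11) follows. The function name `hyper_valence` is hypothetical. One point is an assumption of this sketch rather than something stated in the text: eliminating x and y from eqns. (7)-(9) gives a quadratic in t = v - z, namely t² + 2ft - f = 0, and when L < 2z + 1 (so that f is negative) the root lying between 0 and 1 is the one with the opposite sign of the square root from that printed in eqn. (10).

```python
from math import sqrt


def hyper_valence(z, L):
    """Valence v of a hyperelectronic metal from eqns. (10) and (11)."""
    k = L - 2 * z - 1
    if k == 0:
        return z + 0.5            # L = 2z + 1: exactly half bonds
    f = z * (L - z) / k           # eqn. (11)
    root = sqrt(f * (f + 1))
    # Assumption: choose the root of t**2 + 2*f*t - f = 0 with 0 < t < 1
    t = root - f if f > 0 else -root - f
    return z + t


print(round(hyper_valence(5, 12), 4))   # Cu, Ag, Au: z = 5, L = 12
print(round(hyper_valence(4, 6), 4))    # Zn, Cd, Hg with L = 6
print(round(hyper_valence(4, 12), 4))   # Zn, Cd, Hg with L = 12
```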
Cu, Ag, and Au have z = 5 and L = 12, so that eqns. (10) and (11) yield the value v = 5.4965 for these metals. In the cases of Zn, Cd, and Hg, each atom has six nearest neighbors and six neighbors more distant, so that z = 4 and 6 ≤ L ≤ 12, resulting in 4.5585 ≥ v ≥ 4.4888. For white tin, with z = 2 and four nearest neighbors and six somewhat more distant neighbors (4 ≤ L ≤ 6), eqns. (10) and (11) yield the range 2.5359 ≥ v ≥ 2.4949 for the value of the valence. For all of the hyperelectronic metals, v = z + 1/2 when the ligancy L = 2z + 1, corresponding to half bonds. In all of these cases, v is very nearly equal in value to z + 1/2, so that it is reasonable to assume half-integral values of
Table 2. Number of unsynchronized resonance structures per atom as a function of valence v and ligancy L for hyperelectronic metals.

L \ v      1       2       3       4       5       6       7       8
  2      2.40
  3      2.88    2.88
  4      3.29    3.84    3.29
  5      3.66    4.83    4.83    3.66
  6      4.00    5.83    6.59    5.83    4.00
  7      4.32    6.82    8.51    8.51    6.82    4.32
  8      4.61    7.82   10.60   11.71   10.60    7.82    4.61
  9      4.88    8.81   12.83   15.40   15.40   12.83    8.81    4.88
 10      5.14    9.81   15.20   19.59   21.29   19.59   15.20    9.81
 11      5.39   10.81   17.70   24.28   28.33   28.33   24.28   17.70
 12      5.62   11.80   20.32   29.46   36.59   39.30   36.59   29.46
 13      5.85   12.80   23.06   35.15   46.14   52.74   52.74   46.14
 14      6.07   13.79   25.92   41.33   57.04   68.91   73.35   68.91
 15      6.28   14.79   28.88   48.01   69.33   88.06   99.08   99.08
 16      6.48   15.79   31.94   55.18   83.08  110.43  130.62  138.08
the valence, as was done on an empirical basis in 1949 [27]. With this assumption that v = z + 1/2, the statistical treatment is the same as for hypoelectronic metals, except that the factor 2^{1/2} must be introduced to account for the fact that there now exist two kinds of atoms (M⁺ and M⁻) forming z + 1 bonds. These differ in that M⁺ does not have an unshared electron pair whereas M⁻ does have one. Under these conditions, the equation for the number of unsynchronized resonance structures per atom for hyperelectronic metals is:

$$\nu_{\mathrm{hyper}} = \frac{2^{1/2}\,L!\;v^{(v-1)/2}\,(L-v)^{(L-v+1)/2}}{L^{L/2}\,(v-1/2)!\,(L-v+1/2)!}\left[1 + \frac{v\,(L-v+1/2)}{(v+1/2)\,(L-v)}\right] . \tag{12}$$
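Eqn. (12) involves factorials of half-integral (or other fractional) arguments, which can be evaluated with the Gamma function, x! = Γ(x + 1). The sketch below assumes that reading; the name `nu_hyper` is a hypothetical label.

```python
from math import gamma, sqrt


def nu_hyper(L, v):
    """Eqn. (12), with x! evaluated as gamma(x + 1) so v may be fractional."""
    prefactor = (sqrt(2) * gamma(L + 1)
                 * v ** ((v - 1) / 2) * (L - v) ** ((L - v + 1) / 2)
                 / (L ** (L / 2) * gamma(v + 0.5) * gamma(L - v + 1.5)))
    # Square-bracket factor of eqn. (12)
    ratio = 1 + v * (L - v + 0.5) / ((v + 0.5) * (L - v))
    return prefactor * ratio


print(round(nu_hyper(2, 1), 2))    # compare with the L = 2, v = 1 entry of Table 2
print(round(nu_hyper(12, 6), 2))   # compare with the L = 12, v = 6 entry of Table 2
```

Because the Gamma function is used throughout, the same function accepts non-integral valences such as v = 3.56.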
Similar to the situation for hypoelectronic metals, the ratio of the number of unsynchronized resonance structures per atom to the number of synchronized resonance structures per atom is given by the expression in square brackets in eqn. (12). Values of ν_hyper are given in Table 2 and shown graphically in Figure 3. For small v and L, the values of ν_hyper in Table 2 are slightly greater than those of ν_hypo in Table 1; the opposite is true for large v and L. Again, Figure 3 shows that, for each value of L, a maximum in the number of unsynchronized resonance structures per atom for hyperelectronic metals occurs at v = L/2, as in the case of hypoelectronic metals. The preceding discussion applies to situations in which all the bonds have the same length. For a crystal in which an atom forms two kinds of bonds, that is, L1 bonds with bond number n1 = v1/L1 and L2 bonds with bond number n2 = v2/L2, eqn. (6) takes the form

$$\nu_{\mathrm{hypo}} = C \sum_{n=v-1}^{v+1}\;\sum_{i=\min(0,\,n-L_2)}^{\max(n,\,L_1)} \frac{v_1^{\,i}\,(L_1-v_1)^{L_1-i}}{i!\,(L_1-i)!}\;\frac{v_2^{\,n-i}\,(L_2-v_2)^{L_2-n+i}}{(n-i)!\,(L_2-n+i)!} , \tag{13}$$
where

$$C = \frac{L_1!\,L_2!}{\left[L_1^{L_1}\,v_1^{v_1}\,(L_1-v_1)^{L_1-v_1}\,L_2^{L_2}\,v_2^{v_2}\,(L_2-v_2)^{L_2-v_2}\right]^{1/2}} . \tag{14}$$
For a crystal containing hyperelectronic atoms with two different bond lengths, the number of unsynchronized resonance structures per atom becomes
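$$\nu_{\mathrm{hyper}} = C \sum_{n=v-1/2}^{v+1/2}\;\sum_{i=\min(0,\,n-L_2)}^{\max(n,\,L_1)} \frac{v_1^{\,i}\,(L_1-v_1)^{L_1-i}}{i!\,(L_1-i)!}\;\frac{v_2^{\,n-i}\,(L_2-v_2)^{L_2-n+i}}{(n-i)!\,(L_2-n+i)!} , \tag{15}$$

where

$$C = \frac{2^{1/2}\,L_1!\,L_2!}{\left[L_1^{L_1}\,v_1^{v_1}\,(L_1-v_1)^{L_1-v_1}\,L_2^{L_2}\,v_2^{v_2}\,(L_2-v_2)^{L_2-v_2}\right]^{1/2}} . \tag{16}$$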
When v1/L1 = v2/L2 = v/L, eqn. (13) reduces to eqn. (6) and eqn. (15) reduces to eqn. (12).
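The reduction of eqn. (13) to eqn. (6) can be illustrated numerically. The sketch below makes several assumptions that are not spelled out in the text: v = v1 + v2 is an integer, L1 and L2 are integers, the prefactor of eqn. (14) is read as L1! L2! divided by the square root of the bracketed product, and only terms with non-negative factorial arguments are summed (terms outside that range are taken to vanish). The function name is hypothetical.

```python
from math import factorial, sqrt


def nu_hypo_two_lengths(L1, v1, L2, v2):
    """Eqns. (13)-(14) for a hypoelectronic atom forming two kinds of bonds."""
    v = round(v1 + v2)  # assumed integral total valence
    C = (factorial(L1) * factorial(L2)
         / sqrt(L1 ** L1 * v1 ** v1 * (L1 - v1) ** (L1 - v1)
                * L2 ** L2 * v2 ** v2 * (L2 - v2) ** (L2 - v2)))
    total = 0.0
    for n in range(v - 1, v + 2):                       # n = v - 1, v, v + 1
        # keep only terms whose factorial arguments are non-negative
        for i in range(max(0, n - L2), min(n, L1) + 1):
            j = n - i                                   # bonds of the second kind
            first = v1 ** i * (L1 - v1) ** (L1 - i) / (factorial(i) * factorial(L1 - i))
            second = v2 ** j * (L2 - v2) ** (L2 - j) / (factorial(j) * factorial(L2 - j))
            total += first * second
    return C * total


# Equal bond numbers v1/L1 = v2/L2 = 1/2 with L = 6, v = 3
print(nu_hypo_two_lengths(4, 2, 2, 1))   # should match eqn. (6) for L = 6, v = 3
```

With equal bond numbers the double sum collapses through the Vandermonde identity to the single binomial sum behind eqn. (6), which is why the result coincides with the single-bond-length value.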
5
CALCULATION OF THE NUMBER OF METALLIC ORBITALS PER ATOM FROM THE STATISTICAL THEORY OF THE UNSYNCHRONIZED RESONANCE OF COVALENT BONDS
From the Slater-Pauling curve for the saturation magnetic moment of the first-row transition metal alloys (Figure 1), it was found empirically that the number of metallic orbitals per atom has the value 0.72, corresponding to 28% M⁺, 44% M⁰, and 28% M⁻. Based on the statistical treatment discussed in the preceding section, it is now possible to deduce this value on purely theoretical grounds [36].
First, the amount of metallic orbital per atom in a metal is given by the ratio of the number of M⁺ and M⁰ atoms to the total number of M⁺, M⁰, and M⁻ atoms, since M⁺ and M⁰ require an extra orbital for unsynchronized resonance to occur, whereas M⁻ does not have this possibility according to the principle of electrical neutrality [45]. Moreover, the numbers of M⁺ and M⁻ must be equal and their sum equal to half of M⁺ + 2M⁰ + M⁻. For a hypoelectronic metal with valency v and ligancy L, the theoretical value of ω, the amount of metallic orbital per atom, is
$$\omega = \frac{c+1}{2c+1} , \tag{17}$$

where

$$c = \frac{1}{2}\left(\frac{v}{v+1} + \frac{L-v}{L-v+1}\right) . \tag{18}$$

Therefore,

$$\omega = \frac{4Lv - 4v^{2} + 3L + 2}{6Lv - 6v^{2} + 4L + 2} . \tag{19}$$
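As a quick consistency check, eqns. (17) and (18) can be evaluated directly and compared with the closed form of eqn. (19). The function names below are hypothetical.

```python
def omega_hypo(L, v):
    """Metallic orbitals per atom from eqns. (17) and (18)."""
    c = 0.5 * (v / (v + 1) + (L - v) / (L - v + 1))   # eqn. (18)
    return (c + 1) / (2 * c + 1)                       # eqn. (17)


def omega_hypo_closed(L, v):
    """Closed form, eqn. (19)."""
    return (4 * L * v - 4 * v * v + 3 * L + 2) / (6 * L * v - 6 * v * v + 4 * L + 2)


for v in range(1, 7):
    # Both expressions should agree for L = 12 and v = 1 to 6
    print(v, round(omega_hypo(12, v), 4), round(omega_hypo_closed(12, v), 4))
```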
The calculated values of this quantity are 0.684 ≤ ω ≤ 0.707 for L = 12, 1 ≤ v ≤ 6, and all other values of L and v corresponding to known hypoelectronic metals. For hyperelectronic metals, the theoretical number of metallic orbitals per atom is given by

$$\omega = \frac{x+1}{2} , \tag{20}$$

where x, the amount of M⁰, is found from the simultaneous solution of the following three equations:

$$x = d - [d(d-1)]^{1/2} , \tag{21}$$

$$d = \frac{(z+1)(L-z-1)}{L-2z-1} , \tag{22}$$

and

$$v = z + 1 - x , \tag{23}$$
where z is the number of bonds formed by M⁰, z + 1 is the number of bonds formed by M⁺ and by M⁻, and M⁺, M⁰, and M⁻ occur in the amounts y, x, and y, respectively. Now let us consider the borderline composition Ni_αCu_{1-α} at which the saturation ferromagnetic moment of the alloys of the first-row transition elements is zero. Eqn. (19) yields a value of ω = 0.6842 for Ni, with L = 12 and v = 6. For the hyperelectronic metal Cu, with L = 12 and z = 5, eqns. (21)-(23) give x = 0.5035 and v = 5.4965, so that eqn. (20) yields ω = 0.7518, somewhat larger than the value of ω = 0.6842 for the hypoelectronic metal Ni. To obtain the value of ω from the composition Ni_αCu_{1-α} at which the saturation ferromagnetic moment must be zero requires two steps. First, we consider the weighted mean of the values of ω for Ni and Cu:
$$\omega = 0.6842\,\alpha + 0.7518\,(1-\alpha) . \tag{24}$$

Second, we obtain another equation for ω as a function of α by summing the number of available bonding orbitals. Of the nine outer orbitals (3d⁵4s4p³), six are used, with six electrons, to form bonds. The alloy Ni_αCu_{1-α} has 11 - α outer electrons, of which six are bonding electrons and 5 - α occupy 2.5 - α/2 orbitals as unshared pairs. Therefore, 8.5 - α/2 orbitals are occupied by bonding electrons and unshared electron pairs, leaving 0.5 + α/2 of the nine available orbitals to serve as the metallic orbital:

$$\omega = 0.5 + \frac{\alpha}{2} . \tag{25}$$
The simultaneous solution of eqns. (24) and (25) yields α = 0.444 and ω = 0.722. The former value is in excellent agreement with the observed composition Ni44Cu56 at the foot of the Slater-Pauling curve, and the latter value is essentially the same as the empirically deduced value of 0.72 for the average number of metallic orbitals per atom. It should be emphasized that the derivation of ω given here is completely theoretical, and as far as can be determined, the conventional band theory has never been employed to predict the composition Ni44Cu56 at which the saturation ferromagnetic moment of the first-row transition metal alloys has the value zero.
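The two-step argument for the Ni-Cu borderline composition amounts to solving two linear equations in α and ω. A minimal sketch follows; the function and variable names are hypothetical, and the Ni value is taken from eqn. (19) with L = 12 and v = 6.

```python
from math import sqrt


def omega_hyper(z, L):
    """Metallic orbitals per atom for a hyperelectronic metal, eqns. (20)-(22)."""
    d = (z + 1) * (L - z - 1) / (L - 2 * z - 1)   # eqn. (22)
    x = d - sqrt(d * (d - 1))                      # eqn. (21): amount of neutral atoms
    return (x + 1) / 2                             # eqn. (20)


omega_ni = 182 / 266             # eqn. (19) evaluated for L = 12, v = 6
omega_cu = omega_hyper(5, 12)    # Cu: z = 5, L = 12

# Equate eqn. (24) and eqn. (25):
#   omega_ni*alpha + omega_cu*(1 - alpha) = 0.5 + alpha/2
alpha = (omega_cu - 0.5) / (0.5 + omega_cu - omega_ni)
omega = 0.5 + alpha / 2

print(round(omega_cu, 4), round(alpha, 3), round(omega, 3))
```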
6
THE CRYSTAL STRUCTURES OF THE METALS AND THE MAXIMUM VALUES OF THE METALLIC VALENCE
Many physical properties of metals display an approximate correlation with the number of valence electrons of the periodic sequences of the elements. Lothar Meyer was the first to publish, in 1870, a graph of atomic volume as a function of atomic weight [49]. Rather than atomic volume as a function of atomic weight, it is interesting to consider the reciprocal of the atomic volume as a function of atomic number. With appropriate units, the reciprocal of the atomic volume is, in fact, the atomic density, or molarity, of the element in its pure state. In Figure 4 the molarity of the principal allotropes of the elements at STP is shown plotted as a function of the atomic number Z [50]. According to the unsynchronized-resonating-covalent-bond theory of metals, the number of unsynchronized resonance structures per atom increases with increasing ligancy for each value of the valence, as shown in Figure 3. The increase in the number of resonance structures implies an increase in stability, so that a metal will have the structure with the maximum possible ligancy if other factors are not important. For over half of the elements, including most of the hypoelectronic metals, the crystal structure at STP is cubic or hexagonal closest packing, for which each atom has L = 12. Indeed, ligancy 12 is the maximum value allowed for atoms of the same size. For 15 metals, the cubic body-centered structure is the stable one. This A2 structure may also be considered to conform to the above prediction since, in this structure, each atom has eight nearest neighbors and six others further away, so that the effective ligancy is close to 12. In Figure 4 the dip in the curve for the left side of the first-row transition metals occurs for Mn, whose principal allotrope has an anomalous structure (complex body-centered cubic packing), which may be associated with Mn having a half-filled d shell.
Since the maximum number of unsynchronized resonance structures occurs at valence v = L/2, or bond number n = v/L = 1/2, and the maximum value of L for an elementary metal is 12, the maximum value of the metallic valence is expected to be 6. This value, or one only slightly larger, is what is found for the transition metals. In Figure 5 the portion of the second long row of the periodic table has been redrawn using different units for the ordinate. For the portion of this curve with increasing slope, the metallic valence increases from a value of 1 for Rb to a maximum of 6 for Mo to Pd.
For the hyperelectronic metals, there exist unshared electron pairs, as well as valence electrons, in the outer shell. Since these unshared electron pairs on adjacent atoms strongly repel one another, the maximum ligancy should accordingly be reduced. This is observed in metallic tin, gallium, and some other hyperelectronic metals. From Slater's treatment of molecules involving covalent bonds [51], it is found that the exchange energy providing most of the stability of a molecule occurs with the opposite sign and the factor 2 for the interaction energy of electron pairs on adjacent atoms [52]. This destabilizing effect reduces the maximum value of the ligancy in the hyperelectronic metals. Precise calculation of
Figure 4. The atomic density, or molarity, of the principal allotropes of the elements as a function of atomic number Z at room temperature and atmospheric pressure. The values for the noble gases have been extrapolated to STP. For sulfur, the values are shown with increasing molarity in the order S~, c-S8, c-S7, and c-S6. The values for phosphorus correspond to white and black phosphorus, respectively, in the order of increasing molarity. Lines corresponding to covalence 2 and 3 have been drawn for the rare-earth metals and for the actinide elements. From [50].
Figure 5. Values of the reciprocal of the molar volume (mol ml⁻¹ · 10⁻²), shown by circles, for the elements Rb to Sn in their metallic state, plotted against the number of outer electrons (1 to 14). The full lines show the metallic valence, reaching the value 6 for Mo to Pd, and the dashed curve shows the number of unsynchronized resonance structures per atom for ligancy 12, multiplied by a scale factor.
the repulsion energy of unshared electron pairs is made difficult because the unshared d electrons occupy contracted orbitals, with decreased interatomic interactions [53]. A further complication exists for the hyperelectronic elements, namely the competition between the energy of stabilization of the substance afforded by the extra covalent bonds of the non-metallic form (with valence 1.44 greater than that for the metallic form) and that afforded by the extra resonance energy for the metallic form. Empirically it is observed that these two contributions are nearly equal for Sn. For the elements above and to the right of Sn, the non-metallic form is the more stable one; for those below and to the left, the metallic form is the more stable one. More can be said concerning the difference between the normal, or non-metallic, covalence of a neutral element, having all of its stable orbitals occupied by single (bonding) electrons and by unshared pairs, and metallic valence, with its requirement of 0.72 metallic orbital per atom. For non-metals, on the one hand,
$$v_{\mathrm{non\text{-}metallic}} = 2p - p_e , \tag{26}$$

with p = 4 for the short periods of the periodic table (sp³ orbitals) or p = 9 for the long periods (d⁵sp³ orbitals) and p_e equal to the number of outer electrons. On the other hand, for metals, the valence is given by

$$v_{\mathrm{metallic}} = 2\,(p - 0.72) - p_e . \tag{27}$$
Some of the hyperelectronic metals, generally those with only a small number of unshared pairs, crystallize with closest packing (L = 12). Examples are Co, Ni, and Cu and their congeners and also Tl and Pb. The other hyperelectronic metals tend to crystallize in structures with L = 2v, in accordance with these structures having the maximum number of unsynchronized resonance structures. Ga is noteworthy in this regard. From eqn. (27), v = 2(9 - 0.72) - 13 = 3.56, implying L ≈ 7. Ga has a unique crystal structure in which each atom has seven nearest neighbors, with an associated bond number whose value is 1/2 (vide infra). Indium has a body-centered tetragonal structure (A3) in which each atom has four nearest neighbors and eight more distant neighbors, perhaps representing an effort to achieve ligancy 7. Thallium has a hexagonal close-packed structure, indicating that unshared electron pair repulsion is less important for it than for its lighter congeners. The nearly equal stability of the two forms of tin, metallic white Sn and non-metallic gray Sn, has been mentioned in the foregoing. From eqn. (26), v = 18 - 14 = 4 for gray tin, and gray tin has the diamond structure, with ligancy 4. This structure does not permit unsynchronized resonance, and consequently, gray tin is not a metal. For white tin, eqn. (27) yields v = 2(9 - 0.72) - 14 = 2.56. White tin has a body-centered tetragonal structure, with four nearest neighbors (corresponding to half bonds) and two neighbors about 0.2 Å further away (corresponding to quarter bonds). The bonds resonate among these six positions, and the effective ligancy has a value of about 5, or twice the metallic valence, as expected. A conventional quantum and thermodynamic treatment of the phase transition between gray tin and white tin has been published [54], but it is not clear from this paper why gray tin is a non-metal (the authors call it a "semimetal"), and why white tin is a metal.
Zn (hexagonal close-packed crystal symmetry; A3 structure), Cd (hexagonal close-packed crystal symmetry; A3), and Hg (rhombohedral crystal symmetry; A3) all have metallic valence v = 4.56, and each has an unusual crystal structure in which each atom has six nearest neighbors and six neighbors farther away, corresponding to bonds about half as strong. Thus, the effective ligancy can be considered to have a value of about 9, in keeping with the prediction from the unsynchronized-resonating-covalent-bond theory of metals that L = 2v. The reduction in ligancy from a value of 12 is due to keeping the number of contacts as small as possible in order to minimize the repulsion of unshared electron pairs.
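The valences quoted for Ga, Sn, and Zn follow directly from eqns. (26) and (27). The sketch below simply repeats that arithmetic; the function names are hypothetical, and the outer-electron counts (13 for Ga, 14 for Sn, 12 for Zn, with p = 9) are those used in the text.

```python
METALLIC_ORBITAL = 0.72   # metallic orbitals per atom


def covalence_nonmetal(p, p_e):
    """Eqn. (26): normal covalence with p stable orbitals and p_e outer electrons."""
    return 2 * p - p_e


def valence_metal(p, p_e):
    """Eqn. (27): metallic valence, reserving 0.72 orbital per atom."""
    return 2 * (p - METALLIC_ORBITAL) - p_e


print(round(valence_metal(9, 13), 2))    # Ga
print(covalence_nonmetal(9, 14))         # gray Sn
print(round(valence_metal(9, 14), 2))    # white Sn
print(round(valence_metal(9, 12), 2))    # Zn, Cd, Hg
```

The difference between the two formulas is always 2 × 0.72 = 1.44, which is the valence difference between the non-metallic and metallic forms mentioned earlier.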
7
THE COMPILATION OF METALLIC SINGLE-BOND RADII AND RADII FOR LIGANCY 12
In 1947, Linus Pauling published a table of single-bond covalent radii derived from the observed interatomic distances in metals by correction for the fractional bond numbers and for the effect of resonance energy [39]. While this compilation has been of tremendous utility to experimentalists and theoreticians alike and is, indeed, one of Pauling's most cited papers [55], the values contained therein were usually somewhat less than the corresponding radii obtained from molecules and complex ions. The statistical theory of unsynchronized resonance described herein permits the calculation of the number of unsynchronized resonance structures per atom and the effective energy per bond. Pauling and his colleague Prof. Barclay Kamb employed this refined method of correcting the observed bond lengths for the effect of resonance energy to tabulate a new set of single-bond covalent radii (Table 3, [41]) which are in much better agreement with the corresponding values obtained from molecules and ions, especially for the enneacovalent single-bond radii derived from the analysis of the bond lengths observed in transition-metal molecules
[56-58]. The observed contraction of interatomic distances in metals arises from two factors. First, in the region near equilibrium, the resonance energy increases in magnitude with decrease in the interatomic distance, so that the bond length becomes shorter. Second, multiple bonding also decreases the interatomic distance. Based on the assumption that the shortening caused by resonance is constant for all bonds formed by the atom, Pauling and Kamb [41] formulated an equation for the observed bond length D in Å as a function of the bond order n and the number of unsynchronized resonance structures ν per bond (the value given in Tables 1 or 2 or calculated from eqns. (6), (12), (13), or (15) divided by half the valence):

$$D(n) = D(1) - A\,\log\{\,n\,[1 + B(\nu - 1)]\,\} . \tag{28}$$
Values of A = 0.700 Å and B = 0.064 were selected to obtain agreement with known single-bond, tetrahedral, octahedral, and enneacovalent radii. The factor ν - 1 is employed in eqn. (28) because there is no resonance energy for ν = 1. By rearranging eqn. (28) into the form

$$n = \frac{10^{[D(1)-D(n)]/A}}{1 + B(\nu - 1)} , \tag{29}$$
it is possible to employ the values given in Table 3 to estimate the bond order n for bonds in metals, alloys, and intermetallic compounds. As an example, we shall return
Table 3. Metallic single-bond radii and radii for ligancy 12 (in Å).

Element   v     R1      R(L=12)      Element   v     R1      R(L=12)
Li        1     1.196   1.547        Cs        1     2.361   2.712
Be        2     0.905   1.126        Ba        2     1.996   2.217
B         3     0.836   0.975        La        3     1.726   1.865
C         4     0.772   0.816        Ce        3     1.713   1.852
Na        1     1.550   1.901        Ce        4     1.555   1.637
Mg        2     1.381   1.602        Pr        3     1.699   1.838
Al        3     1.293   1.432        Nd        3     1.683   1.822
Si        4     1.176   1.258        Pm        3     1.670   1.809
K         1     2.001   2.352        Sm        3     1.663   1.802
Ca        2     1.755   1.976        Eu        2     1.823   2.044
Sc        3     1.515   1.654        Gd        3     1.660   1.799
Ti        4     1.384   1.466        Tb        3     1.641   1.780
V         5     1.310   1.354        Dy        3     1.632   1.771
Cr        6     1.260   1.282        Ho        3     1.624   1.763
Mn        6     1.250   1.272        Er        3     1.615   1.754
Fe        6     1.250   1.272        Tm        3     1.604   1.743
Co        6     1.230   1.252        Yb        2     1.718   1.939
Ni        6     1.224   1.246        Lu        3     1.593   1.732
Cu        5.5   1.245   1.276        Hf        4     1.503   1.585
Zn        4.5   1.307   1.368        Ta        5     1.418   1.462
Ga        3.5   1.258   1.366        W         6     1.378   1.400
Ge        4     1.225   1.307        Re        6     1.352   1.374
Rb        1     2.167   2.518        Os        6     1.328   1.350
Sr        2     1.931   2.152        Ir        6     1.335   1.357
Y         3     1.659   1.798        Pt        6     1.365   1.387
Zr        4     1.515   1.607        Au        5.5   1.410   1.441
Nb        5     1.417   1.461        Hg        4.5   1.466   1.527
Mo        6     1.371   1.393        Tl        2     1.508   1.729
Tc        6     1.338   1.360        Pb        2     1.513   1.734
Ru        6     1.315   1.337        Ra        2     2.050   2.271
Rh        6     1.323   1.345        Ac        3     1.800   1.939
Pd        6     1.353   1.375        Th        4     1.722   1.804
Ag        5.5   1.412   1.443        Pa        5     1.555   1.599
Cd        4.5   1.466   1.527        U         6     1.402   1.424
In        3.5   1.422   1.550
Sn        2.5   1.418   1.594
Sn        4     1.405   1.487
to the case of Ga. As mentioned previously, v = 3.56 and L = 7 for Ga. From eqn. (12), remembering that fractional values of factorials can be computed from the Gamma function, the number of unsynchronized resonance structures per atom is computed to be 8.75, so that the number of resonance structures per bond, obtained by dividing by half the valence, is 8.75/1.78 = 4.91. Each atom in the stable form of Ga has seven nearest neighbors, one at 2.465 Å and two each at 2.700, 2.735, and 2.792 Å. From eqn. (29), with D(1) = 2.516 Å from Table 3, these lengths correspond to bond-number contributions of 0.945 for the single shortest bond and 0.873, 0.778, and 0.645 for the three pairs of longer bonds, each of the latter figures being the total for the two bonds at that distance. Summing these contributions, we obtain an approximate value of the valence equal to 3.24, in fair agreement with the assumed value of 3.56 and in almost exact agreement with the value obtained from a plot of molarity versus atomic number (Figure 3 of ref. [50]).
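The gallium estimate can be retraced numerically. The following is a minimal sketch, assuming that D(1) is twice the single-bond radius of Ga listed in Table 3 and that the number of resonance structures per bond is the per-atom value divided by half the valence; the function and variable names are illustrative and do not come from the original work. Under these assumptions the per-bond values for the three longer distances come out at about half of the figures quoted in the text, which suggests that the quoted figures already include the factor of two for the paired bonds.

```python
A = 0.700   # Å, constant of eqn. (28)
B = 0.064   # dimensionless constant of eqn. (28)

def bond_number(d_obs, d_single, nu_per_bond):
    """Bond number n from eqn. (29) for an observed bond length d_obs (Å)."""
    return 10 ** ((d_single - d_obs) / A) / (1.0 + B * (nu_per_bond - 1.0))

# Inputs quoted in the text for gallium.
valence = 3.56
nu_per_atom = 8.75
nu_per_bond = nu_per_atom / (valence / 2.0)   # divided by half the valence
d_single = 2 * 1.258                          # assumed: twice R1(Ga) from Table 3, in Å

# (distance in Å, number of neighbors at that distance)
neighbors = [(2.465, 1), (2.700, 2), (2.735, 2), (2.792, 2)]

# Contribution of each distance = per-bond n times the number of such bonds.
contributions = [count * bond_number(d, d_single, nu_per_bond)
                 for d, count in neighbors]
total_valence = sum(contributions)

for (d, count), c in zip(neighbors, contributions):
    print(f"{count} bond(s) at {d:.3f} Å contribute {c:.3f}")
print(f"estimated valence = {total_valence:.2f}")
```

The same `bond_number` helper could be applied to any other metal once D(1) and the number of resonance structures per bond are known.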
8
THE STRUCTURE AND PROPERTIES OF ELEMENTAL BORON. IS IT A METAL?
The unsynchronized-resonating-covalent-bond theory of metals predicts that all of the hypoelectronic elements should be metals, since they all possess the metallic orbital. This is found to be the case, with the one apparent exception of elemental boron, which is a poor conductor of electricity at room temperature (though a good one at high temperature) and is usually classified as a metalloid or semiconductor. Tetragonal boron does not display such characteristic properties of metals as a large electrical conductivity, high malleability, and high ductility because of its unusual structure [59], which is shown in Figure 6. The basic structural feature is the boron icosahedron. In the unit cell, there are 50 boron atoms, with 150 valence electrons and 148 boron-boron nearest neighbors (bond positions). These 50 boron atoms form four nearly regular icosahedra plus two interstitial boron atoms in the unit cell. Within each B12 icosahedron, each B atom has five adjacent atoms at the distance 1.797 ± 0.015 Å. Each icosahedral B atom has an interstitial B atom lying 1.62 ± 0.02 Å from it. Furthermore, each of the interstitial B atoms has four nearest neighbors (icosahedral B atoms). This implies that the icosahedral B atoms have ligancy 6, and the interstitial B atoms have ligancy 4. We can now employ the unsynchronized-resonating-covalent-bond theory to resolve the seeming paradox of tetragonal boron being composed entirely of hypoelectronic elements and yet not displaying metallic properties. Using the values of v = 3 and L = 5 for the icosahedral B atoms and v = 3 and L = 4 for the interstitial B atoms, we see from Table 3 that there are 6.25 and 2.92 (more precisely, 2.9228 according to eqn. (6)) unsynchronized resonance structures per atom for the icosahedral and interstitial B atoms, respectively. This implies that the total number of unsynchronized resonance structures per B12 icosahedron is 6.2500^12 = 3.5527 × 10^9, while the number involving interactions between the icosahedra is only 2.9228^4 = 72.9787, smaller by the factor 10^-8. Moreover, there are no atoms in the B icosahedra with neighbors such as to permit the occurrence of pivoting resonance to transfer a charge from one B12 unit to another. Thus, the electrical conductivity should be very small, as is found experimentally for relatively low temperatures. The fact that boron becomes a good electrical conductor at high temperatures may be due to its undergoing a phase transition at these temperatures.

Figure 6. Structure of tetragonal boron as viewed in the direction of the c axis. One unit cell is shown. Two of the icosahedral groups (light lines) are centered at z = 3/4. The interstitial boron atoms (open circles) are at (0, 0, 0) and (1/2, 1/2, 1/2). The various structurally non-equivalent boron atoms are identified by numbers. All of the extra-icosahedral boron nearest-neighbor interactions are shown with the exception of B4-B4, which is found parallel to the c axis from each icosahedron to the icosahedra in cells directly above. From [30], p. 364.

We may also employ eqn. (29) to estimate the bond numbers in tetragonal boron. From Table 3, we have D(1) = 1.672 Å, leading to a normalized icosahedral bond number of 0.4776 and an interstitial bond number of 1.0170. Therefore, the average bond number per atom for the icosahedral B atoms, n_icos, is 0.5068, and the average number of unsynchronized resonance structures for the icosahedral B atoms, ν_icos, is 6.0702. A check of the internal consistency of these numbers is provided by eqn. (28): D(n) = 1.7932 Å. This compares with the experimental average icosahedral bond distance of 1.7874 Å, which differs from the calculated bond distance by only 0.3%. One may also ask why boron, as a hypoelectronic element, does not crystallize with ligancy 12. The explanation may be that, similar to the hypoelectronic metals, in which the unshared electron-pair repulsion reduces the maximum ligancy, boron is characterized as having repulsion by electron pairs, but in this case, electron pairs that are involved in bonding with other atoms. The tail of one orbital and the head of one orbital centered on another B atom also repel each other. These repulsions are diminished by having the tails directed away from the bond directions. In the B12 units, each B atom lies on an approximate fivefold rotational axis and directs five of its bonds along the adjacent edges of the icosahedron, with the sixth bond being directed outward along the rotational axis. Thus, six ligands are situated at six corners of an icosahedron around the first atom while the remaining six positions of the icosahedron are vacant.
The tail of the orbital for the axial bond points to the center of the B12 icosahedron, and the tails of the bonds to the ring of five ligands lie in the vacant bond, occupying the other five icosahedral positions around the first atom. The application of the unsynchronized-resonating-covalent-bond theory to the investigation of the structure and properties of the boranes has been given elsewhere [43].
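The numerical statements made for tetragonal boron can be checked with a few lines of arithmetic. The sketch below assumes that the per-atom counts 6.2500 and 2.9228 are raised to the powers 12 and 4, and that the averaged value 6.0702 is inserted directly as ν in eqn. (28); the second assumption is an inference from the quoted result rather than something stated explicitly. All names are illustrative.

```python
import math

A = 0.700   # Å, constant of eqn. (28)
B = 0.064   # dimensionless constant of eqn. (28)

def bond_length(n, nu, d_single):
    """Bond length D(n) in Å from eqn. (28), with a base-10 logarithm."""
    return d_single - A * math.log10(n * (1.0 + B * (nu - 1.0)))

# Resonance-structure counts quoted in the text.
nu_icosahedral = 6.2500                      # per icosahedral B atom
nu_interstitial = 2.9228                     # per interstitial B atom
within_icosahedron = nu_icosahedral ** 12    # twelve atoms in a B12 unit
between_icosahedra = nu_interstitial ** 4    # four neighbors of an interstitial atom
ratio = between_icosahedra / within_icosahedron

# Consistency check of eqn. (28) with the averaged values from the text.
d_single = 1.672      # Å, D(1) for boron
n_icos = 0.5068       # average bond number, icosahedral B atoms
nu_icos = 6.0702      # average number of resonance structures (assumed used as nu)
d_calc = bond_length(n_icos, nu_icos, d_single)
d_exp = 1.7874        # Å, experimental average icosahedral distance
deviation_percent = 100.0 * abs(d_calc - d_exp) / d_exp

print(f"structures within B12:      {within_icosahedron:.4e}")
print(f"structures between units:   {between_icosahedra:.4f}")
print(f"ratio:                      {ratio:.2e}")
print(f"D(n) from eqn. (28):        {d_calc:.4f} Å ({deviation_percent:.1f}% from experiment)")
```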
9
THE NATURE OF THE METAL-METAL BOND IN ALLOYS, INTERMETALLIC COMPOUNDS, AND ON THE SURFACES OF ALLOYS§
§This section is based in part on an unpublished manuscript written by Linus Pauling in 1992.

There are two factors at work in the formation of bonds between unlike atoms. First, it is well known that there is a polarization of the electron pair in bonds between unlike atoms. Indeed, when Pauling formulated the electronegativity scale in 1932 [60], he was motivated by evidence that, essentially without exception, single bonds between atoms A and B are stronger than the average of the strengths of the single bonds A-A and B-B. He accounted for the extra stability by the difference in electronegativity values of the different kinds of bonded atoms. This discovery might have been made much earlier except for the fact that it was not known until about 75 years ago that dioxygen and dinitrogen do not contain single bonds, but rather multiple bonds that are more stable than any single-bonded structure. The increased stability of A-B single bonds is explained by polarization of the shared electron pair, which may be described as a contribution of an ionic structure A+B- to the normal covalent structure, that is, to the partial ionic character of the single bond. The amount of charge transferred by this mechanism is determined by the difference in electronegativity of A and B, leading Pauling to formulate the following expression for the bond energy E, in kcal/mol, for the bond A-B:
$$E(\mathrm{A{-}B}) = \tfrac{1}{2}[E(\mathrm{A{-}A}) + E(\mathrm{B{-}B})] + 23(x_\mathrm{A} - x_\mathrm{B})^2 \qquad (30)$$
In this equation, x_A refers to the Pauling electronegativity of atom A [30]. However, there is another mechanism for transferring electric charge from one atom to another. Electric charge may be transferred in such a way as to increase the number of bonds formed by both A and B or by one of them, or by increasing the number of stronger bonds at the expense of weaker bonds. An example is the crystal BN, with a structure related to that of diamond, with B and N alternating. In diamond, each carbon atom forms four single bonds with its neighbors. The close approximation of the bond length in BN to that of diamond indicates that in the BN crystal both B and N are forming four single bonds with their neighbors. B and N have four outer orbitals that can serve as bond orbitals or for occupancy by an unshared pair. With three and five outer electrons for B and N, respectively, each of these atoms is normally trivalent. Transfer of an electron from the unshared pair of nitrogen to a bond orbital of boron, yielding B- and N+, permits each of the atoms to be quadrivalent. This mechanism for electron transfer is essentially independent of the electronegativity difference of atoms A and B. For BN the direction of electron transfer is in fact opposite to that corresponding to the difference in electronegativity. B can be described as a hypoelectronic element, with fewer electrons than available bond orbitals, and N as a hyperelectronic element, with more electrons than available bond orbitals. Transfer of an electron from a hyperelectronic atom to a hypoelectronic atom leads to an increase in valence for each atom and to a formal charge of +1 for one atom and -1 for the other atom. This electron transfer is sometimes in the direction corresponding to the partial ionic character resulting from difference in electronegativity, and sometimes in the opposite direction.
For BN the difference in electronegativity indicates that each bond has about 22% partial ionic character according to the equation [61]:
$$\text{Amount of ionic character} = 1 - \exp[-(x_\mathrm{A} - x_\mathrm{B})^2/4], \qquad (31)$$
so that the four bonds formed by each atom reduce the actual charges to about ±0.12.
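The figures quoted for BN follow from eqn. (31) and simple charge bookkeeping. The sketch below assumes Pauling electronegativities of 2.0 for B and 3.0 for N; these values are not given in the text and are chosen because a difference of 1.0 reproduces the quoted 22%. It also assumes that each of the four bonds offsets the formal charge by its fractional ionic character. The helper names are illustrative.

```python
import math

def ionic_character(x_a, x_b):
    """Fractional ionic character of a single bond, eqn. (31)."""
    return 1.0 - math.exp(-((x_a - x_b) ** 2) / 4.0)

def electronegativity_term(x_a, x_b):
    """Second term of eqn. (30), the extra bond energy in kcal/mol."""
    return 23.0 * (x_a - x_b) ** 2

# Assumed electronegativities (not stated in the text).
x_boron, x_nitrogen = 2.0, 3.0

fraction = ionic_character(x_boron, x_nitrogen)
extra_energy = electronegativity_term(x_boron, x_nitrogen)

# After electron transfer, N has formal charge +1 and B has formal charge -1.
# Each of the four bonds shifts a fraction of an electron toward N,
# which partly cancels both formal charges.
bonds_per_atom = 4
resultant_charge_n = +1.0 - bonds_per_atom * fraction
resultant_charge_b = -1.0 + bonds_per_atom * fraction

print(f"ionic character per bond: {100 * fraction:.0f}%")
print(f"electronegativity term of eqn. (30): {extra_energy:.0f} kcal/mol")
print(f"resultant charges: N {resultant_charge_n:+.2f}, B {resultant_charge_b:+.2f}")
```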
Table 4 CLASSIFICATION OF ATOMS WITH RESPECT TO EFFECT OF CHANGE OF ELECTRON NUMBER ON METALLIC VALENCE

Hypoelectronic Atoms        Atoms with Stable Valence   Hyperelectronic Atoms
Li Be B                     C                           N O F
Na Mg Al                    Si                          P S Cl

Hypoelectronic Atoms        Buffer Atoms                Hyperelectronic Atoms
K Ca Sc Ti V                Cr(a) Mn Fe Co Ni           Cu Zn Ga Ge As Se Br
Rb Sr Y Zr Nb               Mo(a) Tc Ru Rh Pd           Ag Cd In Sn Sb Te I
Cs Ba La Ce(b) Lu Hf Ta     W(a) Re Os Ir Pt            Au Hg Tl Pb Bi Po At

(a) These three atoms can accept electrons but not give up electrons without change in valence.
(b) The rare-earth metals may have some buffering power.
The discussion of metallic valence and of electron transfer from hyperelectronic elements to hypoelectronic elements for metals, both in bulk alloys and on surfaces, is complicated somewhat by the need to consider the effect of the metallic orbital. As pointed out earlier, the metallic orbital, 0.72 per atom on average, is required for the unsynchronized resonance of valence bonds characteristic of metals. For example, tantalum is hypoelectronic and copper is hyperelectronic, and accordingly, electron transfer from copper to tantalum is expected, leading to an increase in valence for both Ta and Cu and to increased strength of bonds [29]. This increased strength of bonds shows up in bulk alloys as an effect independent of the electron transfer induced by difference in electronegativity. In order to ascertain the factors that determine the average atomic volumes in alloys and intermetallic compounds, one must first consider the nature of the constituents. In 1950 [29], Pauling classified the metals in the periodic table as either hypoelectronic, buffer, or hyperelectronic, and he further extended his ideas during the ensuing 40 years [31, 40, 62-70]. Indeed, Pauling's interest in intermetallic compounds went back to his earliest crystal structure determinations while still a graduate student at the California Institute of Technology [71, 72]. As stated heretofore, a hypoelectronic atom has fewer valence electrons than available orbitals: 3 electrons or fewer for the short periods and 6 electrons or fewer for the long periods. Buffer atoms occur only in the long periods and have a valence of 6 and 8.28 orbitals available in the outer shell for occupancy by bonding electrons or unshared electron pairs. Hyperelectronic atoms have more valence electrons than outer orbitals. Pauling's classification scheme is shown in Table 4. Owing to a deficiency of orbitals, hyperelectronic atoms have lowered formal valence.
A buffer atom can accept or donate an electron without change in metallic valence from its normal valence. On the one hand, the hypoelectronic metals increase their valence by accepting an electron while, on the other hand, the hyperelectronic metals increase their valence by donating an electron. Since increased valence is associated with greater stability, other factors notwithstanding, electron transfer from a hyperelectronic atom to a hypoelectronic atom or a buffer atom, or from a buffer atom
Figure 7. The Friauf polyhedron. It consists of 12 smaller atoms at the corners of a truncated tetrahedron, which has four hexagonal faces and four triangular faces, and a larger atom at the center. In condensation, hexagonal faces are shared.
to a hypoelectronic atom in alloys and intermetallic compounds can be expected to occur, with concomitant decrease in the average atomic volume. Two metal atoms that are the same according to the above classification scheme, that is, both hypoelectronic, both buffer, or both hyperelectronic, and that are roughly the same size often form a complete series of solid solutions. In alloys of a hypoelectronic metal and a hyperelectronic metal, the effect of difference in electronegativity can become important, resulting in increased valence and decreased interatomic distances, so that the atomic volume is smaller than otherwise expected. This is, in fact, found in the series of compounds Ca2Pb, Ca5Pb3, CaPb, and CaPb3 between the hypoelectronic metal Ca and the hyperelectronic metal Pb [42]. In alloys of a hypoelectronic metal and a buffer metal (which can transfer an electron to a hypoelectronic metal without changing the valence, 6, of the buffer atom), the hypoelectronic atom can accept an electron to increase its valence and decrease its volume. This is what is observed in the series CoZr, CoZr2, CoZr3, and Co23Zr6 [42]. Similarly, a hyperelectronic metal (Ga) can donate an electron to a buffer metal (Co) to increase the valence and decrease the size of the former, which is observed in the series Co84Ga16, CoGa, and CoGa3 [42]. Metals can pack in myriad ways in the formation of alloys and intermetallic compounds. An interesting structural unit is the Friauf polyhedron (Figure 7), first reported by Friauf for the copper atoms in the 24-atom face-centered compound MgCu2 [73]. It consists of
Figure 8. The Mg32(Al,Zn)49 structural unit containing 104 atoms and formed by twenty condensed Friauf polyhedra.
12 smaller atoms at the corners of a truncated tetrahedron surrounding a larger atom in the center. The Friauf polyhedron has four hexagonal faces and four triangular faces. By sharing their hexagonal faces such that their centers are situated at the corners of a pentagonal dodecahedron, it is possible for twenty Friauf polyhedra to condense together to form the truncated icosahedron shown in Figure 8. This truncated icosahedron contains 104 atoms. Using the stochastic method, Bergman, Waugh, and Pauling found in 1952 [64, 66, 74] that this truncated icosahedron is the structural unit of Mg32(Al,Zn)49 and related phases. When Shechtman, Blech, Gratias, and Cahn reported in 1984 [75, 76] that they had found a rapidly cooled Mn-Al alloy with the apparent composition MnAl6 and having grains with fivefold symmetry and other symmetry elements of the icosahedral group, a great stir was raised in the scientific community because crystallography had as one of its foundation blocks the notion that crystals with fivefold symmetry could not exist. Many other such "icosahedral quasicrystals" were soon found, as well as alloys called "decagonal quasicrystals" involving a single fivefold or tenfold axis [77-79]. Pauling countered the claim of Shechtman et al. that the x-ray powder diffraction pattern of the icosahedral phase of MnAl6 could not be indexed with any Bravais lattice by showing that the patterns of the quasicrystals could be explained by twinning [80-94]. Pauling's explanation was based on the idea that the truncated icosahedron shown in Figure 8 is the basic structural unit of the icosahedral phase of MnAl6. According to him,
Figure 9. The arrangement of the atoms in the β-W structure. There are eight atoms in the primitive cube. Two atoms are centered at (0, 0, 0) and (1/2, 1/2, 1/2). Each of these atoms is surrounded by twelve atoms at the corners of a slightly distorted icosahedron.
eight such units condense to form the β-W structure shown in Figure 9. The icosahedral nature of the clusters in such a cubic crystal explains the appearance of the Fibonacci numbers and the golden ratio in the lines of the x-ray diffraction patterns and also in the electron diffraction patterns and high-resolution electron micrographs of the quasicrystals. Nevertheless, it must be pointed out that Pauling's explanation has not been generally accepted. Indeed, alternative and widely accepted rationalizations invoking such concepts as Penrose tiling and icosahedral crystals in hyperspaces have been advanced to explain the anomalous existence of the quasicrystals [95-103].
10
SUPERCONDUCTIVITY INTERPRETED IN TERMS OF THE UNSYNCHRONIZED-RESONATING-COVALENT-BOND THEORY OF METALS
Any theory of the metallic state worth its salt must account for the phenomenon of superconductivity, in which the resistance of a substance is so small as not to be measurable. Indeed, anyone who has ever seen the manifestation of superconductivity will never forget his or her experience [104]. The generally accepted theory of superconductivity is based upon an assumed interaction at low temperatures between the conduction electrons at the Fermi surface and the phonons with the same wavelength in the crystal [105-108]. This interaction decreases the enthalpy for the superconducting state and produces a small gap at the Fermi surface, so that bands occupied by electrons are stabilized, and the adjacent bands are increased in energy. The existence of the gap results in the electrons no longer contributing to the heat capacity, in contrast to the normal metallic state. Associated with this are smaller values of the Hooke's-law force constants of the bonds, so that a metal is more easily deformed in the superconducting state than in the normal state. The electron-phonon-interaction theory is supported by the fact that the metals with largest conductivity at room temperature do not become superconductors, even at temperatures below about 0.2 K. Inasmuch as the normal resistance of metals arises from scattering of electron waves of the conducting electrons by phonons, low resistance in the normal metallic state results from small electron-phonon interaction, which results also in decreased stability of the superconducting state. A characteristic feature of the superconducting state, as discovered by Cooper [109], Abrikosov [110], and Gor'kov [111], is that the effective charge carrier is 2e, rather than e. In 1968, Linus Pauling was able to account for superconductivity in terms of the unsynchronized-resonating-covalent-bond theory of metals [112-114]. He began by assuming that a phonon is present in the crystal.
Then, in the compressed regions of the crystal (the crests of the waves), corresponding to compressional waves of the phonon, the interatomic distances are decreased, and in the expanded regions (the troughs of the waves), they are increased. This means that the charged atoms in the crystal are under strain, that is, tension for M- is decreased in the crests and increased in the troughs, and tension for M+ is increased in the crests and decreased in the troughs, according to eqn. (28). Consequently, the normal electronic state in the presence of phonons involves a larger concentration of M- in the crests of the wave and a larger concentration of M+ in the troughs. The stability of the crystal is increased because unsynchronized resonance can take place in the crests between M- and M0 (which have a metallic orbital) and the troughs between M0 and M+. Let us first consider a hypoelectronic metal, such as Al, which has three valence electrons and four available orbitals. The Al atom can accept an additional bonding electron,
resulting in valence 4. An Al atom with valence 4 forms stronger bonds with its twelve neighbors than one with valence 3, so that we can expect a contraction around the Al- atom. This implies a stabilizing interaction between Al- and the crest of the compressional phonon, so that Al can be called a "crest superconductor." The situation is the opposite for a hyperelectronic metal, such as Ga, which has metallic valence 3.56. The possession of an extra electron there decreases the valence by one. Accordingly, a hyperelectronic metal is a "trough superconductor." Electron-phonon interaction would be the spoiling mechanism for setting a limit on the temperature Tc above which a substance no longer is superconducting because of the above considerations. One can further expect that small values of the electron-phonon interaction, and associated high Tc, can be obtained by having both crest and trough superconductors in the same superconducting material. Furthermore, alloys of two hypoelectronic metals, or of two hyperelectronic metals, would have values of Tc intermediate between those for the individual metals because of the considerations associated with eqn. (29). The discovery of the high-temperature copper-oxide superconductors [115-117] provides an example wherein the concept of resonating covalent bonds might be of value in explaining the existence and behavior of such materials [113, 114, 118-120]. So far it has not been possible to explain the existence of these superconductors within the framework of the Bardeen-Cooper-Schrieffer (BCS) theory of superconductivity [107] although a quantum-gas model requiring only strong quantum interactions for a charged-particle gas has been developed to yield estimates for the superconducting transition temperatures for wide-ranging states of matter [121]. The unsynchronized-resonating-covalent-bond theory of metals, with electromagnetic interactions, provides a mechanism for keeping the electron pairs moving in the same direction.
Ordinarily, the superconducting transition temperature is kept low due to the scattering of electrons by phonons, which also accounts for the electrical conductivity of metals in their normal state decreasing with increasing temperature. Nevertheless, electron-phonon scattering can be kept small by having both crest and trough superconductors in the material. As pointed out earlier, the charge of the superconducting current in a crest superconductor travels with the crest of the waves while the opposite is true for the case of a trough superconductor. It turns out that the high-temperature superconducting ceramic materials, such as (Y,La,Sr,Ba)2CuO4-y, satisfy just this requirement. They differ from other copper-oxygen compounds in that the CuO4 squares are condensed into infinite layers with each O atom lying midway between two Cu atoms: ···Cu-O···Cu-O···Cu-O···. In such a line of atoms, there are only two structures, each extending the length of the line, and there is only a small probability of resonance occurring. However, unsynchronized resonance can occur if some O atoms are missing (Figure 10), so that vacancies interrupt the sequences. Also, there is some dismutation of 2 Cu(II) into Cu(I) + Cu(III). Cu(I) can pass its electron pair onto the adjacent Cu(III) by donating to the back lobe of a p orbital as the electron pair occupying the front lobe travels onto the next Cu(III), converting it to Cu(I). Similar processes involving Cu(II) can also occur. Provided the segments are sufficiently short, then
Figure 10. Schematic representation of the structure of the superconducting phase of YBa2Cu3O6.8. (Labels in the original drawing: Y, Ba, Cu, Cu(III), O, and vacant O sites, with the cell shown from z = 0 to z = 1.)
AN EXAMPLE: YBa2Cu3O6.8

Constituent                       Oxidation number
6.8 O2-                           -13.6
Y3+                                +3.0
Ba2+ + Ba+                         +3.0
One Cu(III), square planar         +3.0
Two other Cu, L = 5                +4.6
Sum of oxidation numbers            0

Hence the two Cu have oxidation number +2.3, and 0.7 metallic orbital.

Figure 11. Analysis of the oxidation numbers in the superconducting phase of YBa2Cu3O6.8.
the amount of unsynchronized resonance can become great enough so that the conducting state is more stable than the insulating state, as with metals. An additional requirement for high-temperature superconductivity is that such hypoelectronic atoms as La, Y, Ba, or Sr can interact with the hyperelectronic Cu atoms. This results in electron transfer from the Cu atoms to the hypoelectronic atoms, which leads to the formation of covalent bonds that resonate among the Y-Y and Y-Cu positions, conferring electronic conductivity on the substance. These two types of resonance caused by the combination of crest and trough metals couple with the phonons to yield superconductivity at relatively high temperatures. The above two requirements are met if (1) the hypoelectronic atoms retain some electrons and are the right distance apart to form resonating metal-metal bonds and (2) the hyperelectronic atoms have such a valence that they provide about 0.72 metallic orbital per atom, so that unsynchronized resonance can occur. An example is provided by YBa2Cu3O6.8. According to the scheme shown in Figure 11, the appropriate summation of the constituent oxidation numbers to yield a neutral compound requires that, on average, two of the three Cu atoms each have oxidation number +2.3 and ligancy 5, implying that they possess about 0.7 metallic orbital per atom. Finally, high superconducting transition temperatures are favored by the tight binding of atoms with low atomic weight, so that one would expect that Sc is more effective than Y in increasing the value of Tc, whereas insofar as the hyperelectronic metal is concerned, Cu, Ag, and Au are the only ones with
Figure 12. The crystal structure of K3C60: a view from the top (after [125]). Four K atoms are located at the centers of the octahedral interstices, (1/2, 1/2, 0), etc., and eight K atoms are located at the centers of the tetrahedral interstices, (1/4, 1/4, 1/4), etc. (Labels in the original drawing: the octahedral K sites and the cube edge, 14.24 Å.)
the proper range of oxidation states although the latter two are too heavy and too loosely bound for one to expect them to be found in high-temperature superconductors. Another interesting application of the unsynchronized-resonating-covalent-bond theory of metals to superconductivity is the elucidation of the mechanism of superconductivity in the substances K3C60 and Rb3C60, for which superconducting transition temperatures of 19.3 K [122] and 28 K [123], respectively, have been found [124]. The crystal structure of K3C60 was reported in 1991 [125]. The salient features of the structure are shown in Figure 12. The K3C60 crystal can be described as having the BiF3 structure, with the C60 molecules replacing Bi at the face-centered cubic lattice points 0 0 0, etc. (two different and random orientations), four K atoms located at the centers of the octahedral interstices,
(1/2, 1/2, 0), etc., and eight K atoms located at the centers of the tetrahedral interstices, (1/4, 1/4, 1/4), etc. The edge of the unit cube is 14.24 Å, which is only 1% greater than that determined for C60, 14.11 Å. C60 itself is a truncated icosahedron with Ih symmetry and has 32 carbon rings, 12 of which are five-membered and 20 of which are six-membered. All of the carbon atoms in C60 are equivalent, and the cage radius R is given by
$$R = \tfrac{1}{2}[\tau^2 (r_1 + 2r_2)^2 + r_1^2]^{1/2} \qquad (32)$$
where τ is the golden ratio, equal to (√5 + 1)/2, and r1 and r2 are the two independent carbon-carbon bond lengths, which have the values 1.401 Å and 1.458 Å, respectively [126]. Pauling's analysis [127] of the mechanism of superconductivity in K3C60 is as follows. The difference in electronegativity values for K and C, 1.7, corresponds to 51% ionic character according to eqn. (31). This amount of ionic character is compatible with the normal valence of 1 for K and with valence 2, corresponding to transfer of one electron from C to K, as well. Indeed, transfer of two or three electrons from C to K might also occur, giving rise to valence 3 or 4, respectively. Each of the tetrahedral K atoms (Ktetr) lies 3.27 Å from 24 C atoms (four C6 rings), corresponding to valence 4 and formal charge -3. This is seemingly forbidden by the electroneutrality principle, but with the approximately 50% ionic character of the four bonds, the resultant charge on Ktetr is only -1, which is permitted by the electroneutrality principle. The cause of this unusual structure is the fitting of Ktetr into the small tetrahedral interstices. Each of the octahedral K atoms (Koct) is bonded to 36 C atoms, 12 at 3.63 Å and 24 at 3.83 Å, corresponding to the normal valence of 1 for K, with metallic resonance among v = 0 (K+), v = 1 (K), and v = 2 (K-). The lattice constant is determined by the Ktetr-C and Koct-C interactions, as well as the C60-C60 van der Waals interactions. Since the smallest K-K distance, 6.17 Å, is too large for K-K transfer, the metallic resonance results only from K-C60 interactions. From the above considerations, Ktetr cannot be involved in the electrical conductivity inasmuch as the resultant charge, -1, prevents it from accepting another bond; indeed, a fifth bond would require another bond orbital, which is not available.
This leads to the conclusion that metallic conductivity results from Koct-C60 interactions, whereby Koct, resonating among K+, K0, and K-, interacts with C60(+), C60(2+), and C60(3+). The positive charges can move in synchronism because the negative charges are immobilized on Ktetr and are not available to participate in annihilation. In contrast to the situation for the high-temperature ceramic superconductors, wherein lighter hypoelectronic elements give rise to higher Tc, the packing restriction on Mtetr discussed above favors a higher Tc for a heavier alkali atom, which is what is observed experimentally.
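Several of the numbers quoted in this section can be retraced with short calculations. The sketch below covers the oxidation-number balance of Figure 11, the ionic character and resultant charge for the tetrahedral K atoms, the cage radius of eqn. (32), and the shortest K-K distance. It assumes that the resultant charge is the formal charge offset by the ionic character of each of the four bonds, and that the 6.17 Å separation is the distance from a tetrahedral site to the nearest octahedral site, a√3/4; both are inferences from the quoted values, and all names are illustrative.

```python
import math

def ionic_character(delta_x):
    """Fractional ionic character from eqn. (31) for an electronegativity difference."""
    return 1.0 - math.exp(-(delta_x ** 2) / 4.0)

def cage_radius(r1, r2):
    """Radius of the C60 cage from eqn. (32); r1 and r2 in Å."""
    tau = (math.sqrt(5.0) + 1.0) / 2.0            # golden ratio
    return 0.5 * math.sqrt(tau ** 2 * (r1 + 2.0 * r2) ** 2 + r1 ** 2)

# Oxidation-number balance of Figure 11 for YBa2Cu3O6.8.
fixed_contributions = {
    "6.8 O(2-)": 6.8 * -2.0,
    "Y(3+)": 3.0,
    "Ba(2+) + Ba(+)": 3.0,
    "one Cu(III)": 3.0,
}
needed_from_two_cu = -sum(fixed_contributions.values())   # makes the compound neutral
oxidation_per_cu = needed_from_two_cu / 2.0

# K-C bonds in K3C60, electronegativity difference 1.7 as quoted in the text.
fraction = ionic_character(1.7)
formal_charge_k_tetr = -3.0
resultant_charge_k_tetr = formal_charge_k_tetr + 4 * fraction   # assumed bookkeeping

# Geometry of the cage and of the K sublattice.
radius = cage_radius(1.401, 1.458)
cell_edge = 14.24                                  # Å
shortest_k_k = cell_edge * math.sqrt(3.0) / 4.0    # assumed tetrahedral-octahedral distance

print(f"oxidation number of each of the two Cu: +{oxidation_per_cu:.1f}")
print(f"ionic character of K-C bonds: {100 * fraction:.0f}%")
print(f"resultant charge on tetrahedral K: {resultant_charge_k_tetr:+.2f}")
print(f"C60 cage radius: {radius:.3f} Å")
print(f"shortest K-K distance: {shortest_k_k:.2f} Å")
```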
11
CONCLUSIONS
Linus Pauling was graced with a long and productive life. With his encyclopedic knowledge of crystal and chemical structures, he was singularly suited to apply his theories to elucidate the nature of the chemical bond in its myriad manifestations. Those of us who are not similarly gifted would have wished him an even longer life to explain the perplexing problems in nature which are continually being revealed through the hands of many skilled scientists. This is especially true for the understanding of metals, alloys, and intermetallic systems. I would like to conclude with a citation from an historical reminiscence on his work involving metals that Pauling published near the end of his life [72]: "The resonating-valence-bond theory of metals is a quantum mechanical theory. There is no aspect of it that is not compatible with quantum mechanics. In fact, there is nothing that I have published since 1925 that is incompatible with quantum mechanics. I began the study of quantum mechanics as soon as Heisenberg had published his first paper, and I continued it, in 1926, while Schrödinger was publishing his papers on wave mechanics. My first paper on quantum mechanics was published in 1926, and in the following ten years I published about 50 papers in this field, many of them relating to the nature of the chemical bond. In 1935, with my student E. Bright Wilson, I published a textbook on this subject, Introduction to Quantum Mechanics, which is still in print, without revision. Many solid-state physicists discuss the structure and properties of metals and alloys with use of the band theory, in its several modifications. This theory is also a quantum mechanical theory, which starts with a solution of the wave equation for a single electron, and introduces electron-electron correlation in one or another of several ways.
The resonating-valence-bond theory introduces electron-electron correlation in several stages, one of which is by the formation of covalent bonds between adjacent atoms, and another the application of the electroneutrality principle to restrict the acceptable structures to those that involve only M⁺, M⁰ and M⁻. It should be possible to find a relationship between the band-theory calculations and the resonating-covalent-bond theory, but I have been largely unsuccessful in finding such a correlation. I have, for example, not been able to find any trace of the metallic orbital in the band-theory calculations, which thus stand in contrast to the resonating-valence-bond theory, in which the metallic orbital plays a predominant role." It is hoped that the elucidation of the unsynchronized-resonating-covalent-bond theory presented in this chapter demonstrates that it is an intuitively appealing and useful adjunct to band theory in the interpretation of the often complex structures and properties displayed by metals, alloys, and intermetallic compounds. Furthermore, it is hoped that the foregoing presentation will encourage other researchers to investigate the explicit relationship between band theory and the unsynchronized-resonating-covalent-bond theory
and to discern the existence of the metallic orbital in band theory calculations. Indeed, according to the unsynchronized-resonating-covalent-bond theory, what distinguishes metals from other substances is that metallic properties are conferred on a substance if the metallic orbital is present to allow the occurrence of the unsynchronous resonance that affords the substance extra stabilization energy. This theory yields a non-integral value for the number of metallic orbitals per atom, namely, 0.72 orbital, because the number of metallic orbitals per atom represents an average value for the various degrees of ionization of an atom in a metal.
Acknowledgments
Professor Barclay Kamb of the California Institute of Technology contributed to the development of some of the mathematical details discussed herein, and his name is cited in the references for the pertinent matters. I thank Jane Buechel and Dorothy Bruce Munro for proofreading the manuscript. I also thank Mrs. Munro for bringing to my attention the uncompleted and unpublished writings of Professor Pauling mentioned in Sections 1 and 9. Professor Jacob Bekenstein, Dr. Howard Houben, and David Meir-Levy are thanked for discussions concerning the possible existence of brass and bronze among the Hebrews during Biblical times.
REFERENCES
1. Binary Alloy Phase Diagrams, T. B. Massalski, J. L. Murray, L. H. Bennett, and J. H. Baker, eds., American Society for Metals, Metals Park, OH 44073, 1986, 2 volumes.
2. P. Villars and L. D. Calvert, Pearson's Handbook of Crystallographic Data for Intermetallic Phases, American Society for Metals, Metals Park, OH 44073, 1985, 3 volumes.
3. L. Pauling, The application of the quantum mechanics to the structure of the hydrogen molecule and the hydrogen molecule-ion and to related problems. Chem. Rev. 5, 173-213 (1928).
4. Z. S. Herman, Some early (and lasting) contributions of Linus Pauling in quantum mechanics and statistical mechanics, in Molecules in Natural Science and Medicine: An Encomium for Linus Pauling, Z. B. Maksić and M. Eckert-Maksić, eds., Ellis Horwood, New York, 1991, pp. 179-200.
5. J. C. Slater, Note on molecular structure. Phys. Rev. 41, 255-257 (1932).
6. W. Pauli, Jr., Über Gasentartung und Paramagnetismus. Z. Physik 41, 81-102 (1927).
7. A. Sommerfeld, Zur Elektronentheorie der Metalle auf Grund der Fermischen Statistik, I. Teil: Allgemeines, Strömungs- und Austrittsvorgänge. Z. Physik 47, 1-32 (1928); II. Teil: Thermo-elektrische, galvano-magnetische und thermo-mechanische Vorgänge. Z. Physik 47, 43-60 (1928).
8. W. V. Houston, Die Elektronenemission kalter Metalle. Z. Physik 47, 33-37 (1928).
9. C. Eckert, Über die Elektronentheorie der Metalle auf Grund der Fermischen Statistik, insbesondere über den Volta-Effekt. Z. Physik 47, 38-43 (1928).
10. J. Frenkel, Zur Wellenmechanischen Theorie der metallischen Leitfähigkeit. Z. Physik 47, 819-834 (1928).
11. W. V. Houston, Elektrische Leitfähigkeit auf Grund der Wellenmechanik. Z. Physik 48, 449-468 (1928).
12. F. Bloch, Über die Quantenmechanik der Elektronen in Kristallgittern. Z. Physik 52, 555-600 (1928).
13. A. Sommerfeld and N. H. Frank, The statistical theory of thermoelectric, galvano-, and thermomagnetic phenomena in metals. Rev. Mod. Phys. 3, 1-42 (1931).
14. J. C. Slater, The electronic structure of metals. Rev. Mod. Phys. 6, 209-280 (1934).
15. N. F. Mott and H. Jones, The Theory of the Properties of Metals and Alloys, Clarendon Press, Oxford, 1936. Reprinted by Dover Publications, New York, 1958.
16. A. H. Wilson, The Theory of Metals, Cambridge Univ. Press, London, 1936.
17. H. Fröhlich, Elektronentheorie der Metalle, J. Springer, Berlin, 1936.
18. C. Kittel, Introduction to Solid State Physics, 4th ed., John Wiley and Sons, New York, 1971.
19. J. M. Ziman, Principles of the Theory of Solids, 2nd ed., Cambridge Univ. Press, London, 1972.
20. Physics of Metals. 1. Electrons, J. M. Ziman, ed., Cambridge Univ. Press, London, 1969.
21. W. Kohn and L. J. Sham, Self-consistent equations including exchange and correlation effects. Phys. Rev. A140, 1133-1138 (1965).
22. V. L. Moruzzi, J. F. Janak, and A. R. Williams, Calculated Electronic Properties of Metals, Pergamon Press, New York, 1978.
23. L. Pauling, The nature of the interatomic forces in metals. Phys. Rev. 54, 899-904 (1938).
24. L. Pauling, The metallic state. Nature 161, 1019 (1948).
25. F. J. Ewing and L. Pauling, The ratio of valence electrons to atoms in metals and intermetallic compounds. Rev. Mod. Phys. 20, 112-122 (1948).
26. L. Pauling, La Valence des Métaux et la Structure des Composés Intermétalliques. J. Chim. Phys. 46, 276-287 (1949).
27. L. Pauling, A resonating-valence-bond theory of metals and intermetallic compounds. Proc. Roy. Soc. (London) A196, 343-362 (1949).
28. L. Pauling, The resonating-valence-bond theory of metals. Physica 15, 23-28 (1949).
29. L. Pauling, Electron transfer in intermetallic compounds. Proc. Natl. Acad. Sci. (USA) 36, 533-537 (1950).
30. L. Pauling, The Nature of the Chemical Bond, 3rd ed., Cornell Univ. Press, Ithaca, NY, 1960, pp. 393-448.
31. L. Pauling, The electronic structure of metals and alloys, in Theory of Alloy Phases, American Society for Metals, Cleveland, OH, 1956, pp. 220-242.
32. L. Pauling, The nature of the metallic orbital. Nature 189, 656 (1961).
33. L. Pauling, The nature of the metallic orbital and the structure of metals. J. Indian Chem. Soc. 38, 435-437 (1961).
34. L. Pauling, The metallic orbital and the nature of metals. J. Solid State Chem. 54, 297-307 (1984).
35. B. Kamb and L. Pauling, Extensions of the statistical theory of resonating valence bonds to hyperelectronic metals. Proc. Natl. Acad. Sci. (USA) 82, 8284-8285 (1985).
36. L. Pauling and B. Kamb, Comparison of theoretical and experimental values for the number of metallic orbitals per atom in hypoelectronic and hyperelectronic metals. Proc. Natl. Acad. Sci. (USA) 82, 8286-8287 (1985).
37. L. Pauling and Z. S. Herman, Recent advances in the unsynchronized-resonating-covalent-bond theory of metals, alloys, and intermetallic compounds and its application to the investigation of the structure of such systems, in Modelling of Structure and Properties of Molecules, Z. B. Maksić, ed., Ellis Horwood, Chichester, England, 1987, pp. 5-37.
38. L. Pauling and Z. S. Herman, The unsynchronized-resonating-covalent-bond theory of metals, alloys, and intermetallic compounds, in Valence Bond Theory and Chemical Structure, D. J. Klein and N. Trinajstić, eds., Elsevier, Amsterdam, 1990, pp. 569-610.
39. L. Pauling, Atomic radii and interatomic distances in metals. J. Am. Chem. Soc. 69, 542-553 (1947).
40. L. Pauling, The dependence of the bond length in intermetallic compounds on the hybrid character of the bond orbitals. Acta Cryst. B24, 5-7 (1968).
41. L. Pauling and B. Kamb, A revised set of values of single-bond radii derived from the observed interatomic distances in metals by correction for bond number and resonance energy. Proc. Natl. Acad. Sci. (USA) 83, 3569-3571 (1986).
42. L. Pauling, Factors determining the average atomic volumes in intermetallic compounds. Proc. Natl. Acad. Sci. (USA) 84, 4754-4756 (1987).
43. L. Pauling and Z. S. Herman, The unsynchronized-resonating-covalent-bond theory of the structure and properties of boron and the boranes, in Advances in Boron and the Boranes, J. F. Liebman, A. Greenberg, and R. E. Williams, eds., VCH Publishers, New York, 1988, pp. 517-529.
44. W. Heisenberg, Zur Theorie des Ferromagnetismus. Z. Physik 49, 619-636 (1928).
45. I. Langmuir, Types of valence. Science 54, 59-67 (1921).
46. L. Pauling, The modern theory of valency. J. Chem. Soc., 1461-1467 (1948).
47. C. Zener, Interaction between the d shells in the transition elements. Phys. Rev. 81, 440-444 (1951).
48. L. Pauling, The structure and entropy of ice and of other crystals with some randomness of atomic arrangement. J. Am. Chem. Soc. 57, 2680-2684 (1935).
49. L. Meyer, Die Natur der chemischen Elemente als Funktion ihrer Atomgewichte. Annalen der Chemie u. Pharm. Suppl. Band 7, 354-358 (1870).
50. L. Pauling and Z. S. Herman, Molarity (atomic density) of the elements as pure crystals. J. Chem. Ed. 62, 1086-1088 (1985).
51. J. C. Slater, Molecular energy levels and valence bonds. Phys. Rev. 38, 1109-1144 (1931).
52. L. Pauling, The calculation of matrix elements for Lewis electronic structures of molecules. J. Chem. Phys. 1, 280-283 (1933).
53. L. Pauling and I. Keaveny, Hybrid bond orbitals, in Wave Mechanics, W. C. Price, S. S. Chissick, and T. Ravensdale, eds., John Wiley and Sons, New York, 1973, pp. 83-97.
54. J. Ihm and M. L. Cohen, Equilibrium properties and the phase transition of grey and white tin. Phys. Rev. B23, 1576-1579 (1981).
55. Z. S. Herman, The twenty-five most cited publications of Linus Pauling, in Roots of Molecular Medicine: a Tribute to Linus Pauling, R. P. Huemer, ed., Freeman, New York, pp. 254-259.
56. L. Pauling, Metal-metal bond lengths in complexes of transition metals. Proc. Natl. Acad. Sci. (USA) 73, 4290-4293 (1976).
57. Z. S. Herman, Recent advances in simple valence-bond theory and the theory of hybrid bond orbitals. Intern. J. Quantum Chem. 23, 921-943 (1983).
58. L. Pauling and Z. S. Herman, Valence-bond concepts in coordination chemistry and the nature of metal-metal bonds. J. Chem. Ed. 61, 582-587 (1984).
59. J. L. Hoard, R. E. Hughes, and D. E. Sands, The structure of tetragonal boron. J. Am. Chem. Soc. 80, 4507-4545 (1958).
60. L. Pauling, The nature of the chemical bond. IV. The energy of single bonds and the relative electronegativity of atoms. J. Am. Chem. Soc. 54, 3570-3582 (1932).
61. Ref. 30, pp. 64-107.
62. G. Tunell and L. Pauling, The atomic arrangements and bonds of the gold-silver ditellurides. Acta Cryst. 5, 375-381 (1952).
63. D. P. Shoemaker, R. E. Marsh, F. J. Ewing, and L. Pauling, Interatomic distances and atomic valences in NaZn13. Acta Cryst. 5, 637-644 (1952).
64. G. Bergman, J. L. T. Waugh, and L. Pauling, Crystal structure of the intermetallic compound Mg32(Al,Zn)49 and related phases. Nature 169, 1057 (1952).
65. L. Pauling and P. Pauling, On the valence and atomic size of silicon, germanium, arsenic, antimony, and bismuth in alloys. Acta Cryst. 9, 127-130 (1956).
66. G. Bergman, J. L. T. Waugh, and L. Pauling, The crystal structure of the metallic phase Mg32(Al,Zn)49. Acta Cryst. 10, 254-259 (1957).
67. L. Pauling, A set of effective metallic radii for use in compounds with the β-wolfram structure. Acta Cryst. 10, 374-375 (1957).
68. L. Pauling and B. Kamb, The discussion of tetragonal boron by the resonating-valence-bond theory of electron-deficient substances. Z. Kristall. 112, 472-478 (1959).
69. L. Pauling, Electron transfer and atomic magnetic moments in the ordered intermetallic compound AlFe3, in Quantum Theory of Atoms, Molecules, and the Solid State, P.-O. Löwdin, ed., Academic Press, New York, 1966, pp. 303-306.
70. L. Pauling, Electron transfer and the valence states of cerium and platinum in the cubic Friauf-Laves compounds with the platinum metals. Phys. Rev. Lett. 47, 277-281 (1981).
71. L. Pauling, The crystal structure of magnesium stannide. J. Am. Chem. Soc. 45, 2777-2780 (1923).
72. L. Pauling, The nature of metals. Pure & Appl. Chem. 61, 2171-2174 (1989).
73. J. B. Friauf, The crystal structures of two intermetallic compounds. J. Am. Chem. Soc. 49, 3107-3114 (1927).
74. Sten Samson, The structure of complex intermetallic compounds, in Structural Chemistry and Molecular Biology: A Volume Dedicated to Linus Pauling by his Students, Colleagues, and Friends, A. Rich and N. Davidson, eds., W. H. Freeman, San Francisco, 1968, pp. 687-718.
75. D. Schechtman, I. Blech, D. Gratias, and J. W. Cahn, Metallic phase with long-range orientational order and no translational symmetry. Phys. Rev. Lett. 53, 1951-1953 (1984).
76. D. Schechtman and I. A. Blech, The microstructure of rapidly solidified Al6Mn. Metall. Trans. 16A, 1005-1112 (1985).
77. D. Schechtman, R. J. Schaefer, and F. S. Biancaniello, Precipitation in rapidly solidified Al-Mn alloys. Metall. Trans. 15A, 1987-1997 (1984).
78. I. Bendersky, Quasicrystal with one-dimensional translational symmetry and a tenfold rotation axis. Phys. Rev. Lett. 55, 1461-1463 (1985).
79. K. K. Fung, C. Y. Yang, Y. Q. Zhou, J. G. Zhao, W. S. Zhan, and B. G. Shen, Icosahedrally related decagonal quasicrystal in rapidly cooled Al-14-at.%-Fe alloy. Phys. Rev. Lett. 56, 2060-2063 (1986).
80. L. Pauling, Apparent icosahedral symmetry is due to directed multiple twinning of cubic crystals. Nature 317, 512-514 (1985).
81. L. Pauling, So-called icosahedral and decagonal quasicrystals are twins of an 820-atom cubic crystal. Phys. Rev. Lett. 58, 365-368 (1987).
82. L. Pauling, Evidence from x-ray and neutron powder diffraction patterns that the so-called icosahedral and decagonal quasicrystals of MnAl6 and other alloys are twinned cubic crystals. Proc. Natl. Acad. Sci. (USA) 84, 3951-3953 (1987).
83. L. Pauling, Sigma-phase packing of icosahedral clusters in 780-atom tetragonal crystals of Cr5Ni3Si2 and V15Ni10Si that by twinning achieve 8-fold rotational point-group symmetry. Proc. Natl. Acad. Sci. (USA) 85, 2025-2026 (1988).
84. L. Pauling, Structure of the orthorhombic form of Mn2Al7, Fe2Al7, and (Mn0.7Fe0.3)2Al7 that by twinning produces grains with decagonal point-group symmetry. Proc. Natl. Acad. Sci. (USA) 85, 2422-2423 (1988).
85. L. Pauling, Z. S. Herman, and P. J. Pauling, High-resolution transmission electron-micrograph evidence that rapidly quenched MnAl6 and other alloys are icosatwins of a cubic crystal. Comptes Rendus Acad. Sci. Paris 306, Série II, No. 16, 1147-1151 (1988).
86. L. Pauling, Icosahedral quasicrystals are twins of cubic crystals containing large icosahedral clusters of atoms: The 1012-atom primitive cubic structure of Al6CuLi3, the C-phase of Al37Cu3Li21Mg3, and GaMg2Zn3. Proc. Natl. Acad. Sci. (USA) 85, 3666-3669 (1988).
87. L. Pauling, Additional evidence from x-ray powder diffraction patterns that icosahedral quasicrystals of intermetallic compounds are twinned cubic crystals. Proc. Natl. Acad. Sci. (USA) 85, 4587-4590 (1988).
88. L. Pauling, Unified structure theory of icosahedral quasicrystals: Evidence from neutron diffraction patterns that AlCrFeMnSi, AlCuLiMg, and TiNiFeSi icosahedral quasicrystals are twins of cubic crystals containing about 820 or 1012 atoms in a primitive unit cube. Proc. Natl. Acad. Sci. (USA) 85, 8376-8380 (1988).
89. L. Pauling, Interpretation of so-called icosahedral and decagonal quasicrystals of alloys showing apparent icosahedral symmetry elements as twins of an 820-atom cubic crystal. Computers Math. Applic. 17, 337-339 (1989).
90. L. Pauling, Icosahedral and decagonal quasicrystals as multiple twins of cubic crystals, in Extended Icosahedral Structures (Aperiodicity and Order, Vol. 3), M. V. Jarić and D. Gratias, eds., Academic Press, New York, 1989, pp. 137-162.
91. L. Pauling, Icosahedral quasicrystals of intermetallic compounds are icosahedral twins of cubic crystals of three kinds, consisting of large (about 5000 atoms) icosahedral complexes in either a cubic body-centered or a cubic face-centered arrangement or smaller (about 1350 atoms) icosahedral complexes in the β-tungsten arrangement. Proc. Natl. Acad. Sci. (USA) 86, 8595-8599 (1989).
92. L. Pauling, Icosahedral and decagonal quasicrystals of intermetallic compounds are multiple twins of cubic or orthorhombic crystals composed of very large atomic complexes with icosahedral point-group symmetry in cubic close packing: Structure of decagonal Al6Pd. Proc. Natl. Acad. Sci. (USA) 86, 9637-9641 (1989).
93. L. Pauling, Evidence from electron micrographs that icosahedral quasicrystals are icosahedral twins of cubic crystals. Proc. Natl. Acad. Sci. (USA) 87, 7849-7850 (1990).
94. L. Pauling, Analysis of pulsed-neutron powder diffraction patterns of the icosahedral quasicrystals Pd3SiU and AlCuLiMg (three alloys) as twinned cubic crystals with large units. Proc. Natl. Acad. Sci. (USA) 88, 6600-6602 (1991).
95. D. Levine and P. J. Steinhardt, Quasicrystals: A new class of ordered structures. Phys. Rev. Lett. 53, 2477-2480 (1984).
96. P. Bak, Phenomenological theory of icosahedral incommensurate ("quasiperiodic") order in Mn-Al alloys. Phys. Rev. Lett. 54, 1517-1519 (1985).
97. N. D. Mermin and S. M. Troian, Mean-field theory of quasicrystalline order. Phys. Rev. Lett. 54, 1524-1527 (1985).
98. L. A. Bursill and P. J. Lin, Penrose tiling observed in a quasi-crystal. Nature 316, 50-51 (1985).
99. T. Ogawa, On the structure of a quasicrystal. J. Phys. Soc. Japan 54, 3205-3208 (1985).
100. D. R. Nelson and B. I. Halperin, Pentagonal and icosahedral order in rapidly cooled metals. Science 229, 233-238 (1985).
101. P. J. Steinhardt, Quasicrystals. Am. Scientist 74, 586-597 (1986).
102. P. J. Steinhardt, Icosahedral solids: a new phase of matter? Science 238, 1242-1247 (1987).
103. Quasicrystals, Networks, and Molecules of Fivefold Symmetry, I. Hargittai, ed., VCH Publishers, New York, 1990.
104. In the middle of the last decade, Linus Pauling asked me and two other associates, Ewan Cameron, MD, and Stephen Lawson, to verify experimentally an idea of his for raising the superconducting transition temperature of known superconductors. I worked intensively on this project for almost three years, and, while measurements made in the laboratory of Dr. Howard Hart at General Electric in Schenectady, NY, neither proved nor disproved Pauling's idea, he was nevertheless successful in obtaining a patent on the idea: L. Pauling, Method of Drawing Dissolved Superconductor, U. S. Patent 5,158,588 (27 October 1992). Professor Pauling worked with us in the final stages of the experiment, and I shall never forget this 87-year-old man fearlessly working, with a blowtorch in hand, over an open oven at 1100°.
105. H. Fröhlich, Theory of the superconducting state. I. The ground state at the absolute zero of temperature. Phys. Rev. 79, 845-856 (1950).
106. H. Fröhlich, Isotope effect in superconductivity. Proc. Phys. Soc. (London) A63, 778 (1950).
107. J. Bardeen, L. N. Cooper, and J. R. Schrieffer, Theory of superconductivity. Phys. Rev. 108, 1175-1204 (1957).
108. M. R. Schafroth, S. T. Butler, and J. M. Blatt, Quasichemical equilibrium approach to superconductivity. Helv. Phys. Acta 30, 93-134 (1957).
109. L. N. Cooper, Bound electron pairs in a degenerate Fermi gas. Phys. Rev. 104, 1189-1190 (1956).
110. A. A. Abrikosov, On the magnetic properties of superconductors of the second group. Sov. Physics JETP 5, 1174-1182 (1957).
111. L. P. Gor'kov, Macroscopic derivation of the Ginzburg-Landau equations in the theory of superconductivity. Sov. Physics JETP 36, 1364-1367 (1959).
112. L. Pauling, The resonating-valence-bond theory of superconductivity: Crest superconductors and trough superconductors. Proc. Natl. Acad. Sci. (USA) 60, 59-65 (1968).
113. L. Pauling, Influence of valence, electronegativity, atomic radii, and crest-trough interaction with phonons on the high-temperature copper-oxide superconductors. Phys. Rev. Lett. 59, 225-227 (1987).
114. L. Pauling, The role of the metallic orbital and of crest and trough superconductors in high-temperature superconductors, in High Temperature Superconductivity: The First Two Years, R. M. Metzger, ed., Gordon and Breach Science Publishers, New York, 1989, pp. 309-313.
115. J. G. Bednorz and K. A. Müller, Possible high Tc superconductivity in the Ba-La-Cu-O system. Z. Physik B64, 189-193 (1986).
116. C. W. Chu, P. H. Hor, R. L. Meng, L. Gao, Z. J. Huang, and Y. Q. Wang, Evidence for superconductivity above 40 K in the La-Ba-Cu-O compound system. Phys. Rev. Lett. 58, 405-407 (1987).
117. Chemistry of High-Temperature Superconductors, ACS Symposium Series 351, D. L. Nelson, M. S. Whittingham, and T. F. George, eds., American Chemical Society, Washington, DC, 1987.
118. P. W. Anderson, Resonating valence bonds: A new kind of insulator? Mat. Res. Bull. 8, 153-160 (1973).
119. P. W. Anderson, The resonating valence bond state in La2CuO4 and superconductivity. Science 23, 1196-1198 (1987).
120. P. W. Anderson, G. Baskaran, Z. Zou, and T. Hsu, Resonating-valence-bond theory of phase transitions and superconductivity in La2CuO4-based compounds. Phys. Rev. Lett. 58, 2790-2793 (1987).
121. M. Rabinowitz, Quantum-gas model estimate for a wide range of superconducting critical temperatures. Intern. J. Theor. Phys. 28, 137-146 (1989).
122. A. F. Hebard, M. J. Rosseinsky, R. C. Haddon, D. W. Murphy, S. H. Glarum, T. T. M. Palstra, A. P. Ramirez, and A. R. Kortan, Superconductivity at 18 K in potassium-doped C60. Nature 350, 660-661 (1991).
123. K. Holczer, O. Klein, S.-M. Huang, R. B. Kaner, K.-J. Fu, R. L. Whetten, and F. Deiderich, Alkali-fulleride superconductors: Synthesis, composition, and diamagnetic shielding. Science 252, 1154-1157 (1991).
124. J. Cioslowski, Electronic Structure Calculations on Fullerenes and Their Derivatives, Oxford Univ. Press, New York, 1995, pp. 247 ff.
125. P. W. Stephens, L. Mihaly, P. L. Lee, R. L. Whetten, S.-M. Huang, R. Kaner, F. Deiderich, and K. Holczer, Structure of single-phase superconducting K3C60. Nature 351, 632-634 (1991).
126. Ref. 124, pp. 48 ff.
127. L. Pauling, The structure of K3C60 and the mechanism of superconductivity. Proc. Natl. Acad. Sci. (USA) 88, 9208-9209 (1991).
Z.B. Maksić and W.J. Orville-Thomas (Editors)
Pauling's Legacy: Modern Modelling of the Chemical Bond
Theoretical and Computational Chemistry, Vol. 6 © 1999 Elsevier Science B.V. All rights reserved.
Epilogue: Linus Pauling, Quintessential Chemist
Dudley Herschbach
Department of Chemistry and Chemical Biology, Harvard University, 12 Oxford St., Cambridge, Massachusetts, USA
Linus Pauling was born soon after Planck postulated the quantum, began his scientific work only a decade after Bohr proposed a magical model for the hydrogen atom and Laue devised x-ray diffraction, and received his Ph.D. just as Heisenberg and Schrödinger were discovering quantum mechanics. Pauling went off to Europe, mastered the new theory, and returned to make Caltech for a quarter-century the Mecca of structural chemistry. With wonderful verve and insight, he developed a repertoire of heuristic concepts, semiempirical correlations and approximations, capable of elucidating a vast range of chemical phenomena. Moreover, as a charismatic teacher, lecturer, and writer, Pauling inspired a host of students and colleagues. His crowning achievement, published in 1951, was the discovery of the alpha-helix and beta-pleated sheet structures for proteins. In large part, he deduced these protein structures from data and principles he had obtained from close study of simple molecules. Thereby, Pauling provided the paradigm for the discovery in 1953 by Watson and Crick of the DNA double-helix structure. For biology as well as chemistry, the consequences of this splendid saga have become immense.
This book does homage to Pauling in two ways. The various chapters amply demonstrate the wide scope and enduring vitality of concepts and tools contributed by him, as well as the influence of his ideas, right or wrong, on subsequent developments. Beyond that, however, the authors have strived to exemplify Pauling's approach to theoretical chemistry. This is adventurous yet pragmatic. It is uninhibited by pretensions to rigor, but rather seeks to find simple models that capture the main features.
Models in the Pauling style provide only zeroth- or first-order approximations, but readily yield interpretations and predictions. The emphasis thus is on qualitative aspects and generalizations that serve to define generic behavior. It is an approach of great value, both in research and teaching. Here I describe a few personal encounters with Pauling, direct and indirect, to complement the previous chapters. These episodes may convey something of his human chemistry to readers who did not have the chance to
know him. Like many chemists of my vintage, I am among Pauling's scientific grandchildren. My Ph.D. mentor, E. Bright Wilson, was one of Pauling's most distinguished scientific sons. Furthermore, my scientific trajectory has intertwined with dozens of others from several generations of Pauling progeny. My first encounter with Pauling, years before I met him, dates from 1950, when I was a freshman at Stanford. My advisor, Harold Johnston, who had received his Ph.D. from Caltech, told me much about Linus and introduced me to his excellent book, General Chemistry. It was captivating, a luscious antidote to the dull tome that served as our official text. I particularly liked Pauling's blend of "real-life" descriptive chemistry with theory and his straightforward but magisterial style, augmented by the fine drawings by Roger Hayward. Later, from other Caltech alumni, I heard about Pauling's showmanship in teaching his freshman course. For instance, when Linus did a traditional demonstration, dropping bits of sodium into a bowl of water and igniting the hydrogen formed, he added an instructive twist. He became extremely excited, in imitation of a stereotypical mad chemist. He would shout and jump about, run to the other end of the lecture table, pour gasoline into a bowl, leap back, and throw in chunks of sodium. Amazed at the lack of an explosion, or any reaction, his frightened students had an unforgettable lesson. Pauling also often posed questions to the class, and the first student to answer was rewarded by receiving a candy bar tossed forth by Linus with gusto. In my senior year, I read the classic text by Pauling and Wilson, Introduction to Quantum Mechanics, for a course taught by Hal Johnston. That book, although published over 60 years ago and since copied in large part by many others, is still in print and unexcelled as a friendly introduction. It stemmed from notes that Bright Wilson, serving as a Teaching Fellow, took on Pauling's lectures.
Bright told me that Linus, in response to a question, liked to pretend he had not considered it before. He would work out the mathematical solution on the blackboard, step-by-step, with commentary suggesting it was a fresh excursion. Often, however, Bright knew that Linus had carefully prepared the derivation before class. Bright also told me that, although he did not remember which of them had written various parts of the book, those done by Linus were turned out at a steady pace, without any revision. As a graduate student, I studied another celebrated Pauling book, The Nature of the Chemical Bond. I was much impressed to see how Linus combined pragmatic empirical analysis with insightful theory to develop a comprehensive--and comprehensible--picture of structural chemistry. Also, I remember noting his generous remarks in dedicating the book to Gilbert Newton Lewis. A few months before completing my Ph.D., I was startled to receive a letter from Pauling. In his capacity as Department Chairman, he invited me to visit Caltech and give two seminars, one dealing with my thesis work and
one with what I intended to do in the future. He said nothing to indicate that Caltech might be interested in me as a faculty candidate (nor had Bright given any hint). I have vivid memories of that visit, in February of 1958. Linus greeted me as if I were indeed his grandson, putting his arm around my shoulders as he ushered me into his office. It was spacious, and contained a profuse collection of molecular models, some with atoms nearly the size of basketballs and bonds like baseball bats. We discussed barriers to internal rotation of methyl groups. This was the topic of my first seminar, which described how Bright and his students determined such barriers from microwave spectra. The method exploited splittings produced by tunneling among the threefold potential energy minima for torsional reorientation of the methyl group about its axis. Linus had just submitted a paper pointing out that many aspects of the internal rotation barriers could be explained, at least qualitatively, by postulating substantial d-orbital hybridization in the C-H bonds. His argument was based on the spherical harmonic addition theorem. However, in the paper Linus made no explicit mention of the theorem. Whereas typical authors would have been explicit, in order to display their erudition, it struck me that Linus was more astute. He presented matters in such a way that someone familiar with the theorem would see that Pauling had made use of it but others, blithely unaware, would nonetheless understand the key points. It was two-part harmony, an exemplary style often seen in his scientific writing. The next day I described my fledgling plans and hopes for molecular beam studies of chemical reaction dynamics. My talk began with praise of the pioneering molecular beam experiments of Otto Stern; however, when I ignorantly referred to him as a physicist, Linus shouted out: "Otto Stern was a physical chemist!"
Of course, it was so, and many times since I have enjoyed telling my research students how I learned that. But looking forward was the chief theme on this occasion. The splendid saga of structural chemistry and bonding theory, so greatly advanced by Pauling, encouraged the notion that something comparable might be achieved for molecular reaction dynamics. This "holy grail" was important for the mavericks who undertook to develop means to study individual molecular collisions. It helped to override the prevalent skepticism of wise elders, by fostering a sense of mission and evangelical fervor, like that so evident in Pauling. This visit was also memorable for revealing moments during an afternoon in which I accompanied Pauling while he ran some errands. As we got into his car, a convertible, he took childlike pleasure in demonstrating how to raise and lower the windows via little levers. On the way, he told story after story about the history of Caltech, including sketches of the background and work of his faculty colleagues. He also, with evident relish, mentioned that Harvard had tried in vain to hire him in 1928, then invited his colleague Richard Badger, and finally as a third choice got a young experimentalist from Princeton, George Kistiakowsky.
After stops at the post office and bank, we came to a local newspaper office. There Pauling submitted an article he'd written about fallout from testing of nuclear weapons. As we were about to go, he said to the editor, "You might like to have this, too," and drew from an envelope a large glossy photo of himself. Linus handed it over with the same broad smile and twinkling eye captured in his photograph. As we drove on, Pauling told me a story about Bright Wilson's Ph.D. thesis of 1933. It presented what was then a major calculation, a state-of-the-art treatment of the lithium atom. However, Caltech's Chemistry Department, like others in that era, had a rule that required a thesis to include experimental work. To fulfill that requirement, Pauling proposed that Bright measure the magnetic susceptibilities of a series of nitroso compounds, and lent him his roadster so that Bright could transport a Gouy balance apparatus from the Mount Wilson Observatory. The point of the story, in Pauling's rendition, was just that this was an easy way to cope with the rule, since Bright only needed a couple of afternoons to make the measurements. I didn't let on that I'd already heard the story from Bright, and that his version gleefully emphasized a different aspect, not mentioned by Pauling. The impetus to determine the magnetism of the nitroso compounds was to test predictions of a theory developed by Pauling. In fact, the experimental results disagreed entirely with the theory, an outcome that left Pauling confused and Wilson amused. The penultimate stop was Pauling's home in the Pasadena hills. He exclaimed that he hadn't intended to arrive there, the car had just followed a habitual path. As we approached the house, Pauling pointed out that the corners met at tetrahedral angles. Walking by the swimming pool, he mentioned that the royalties from his freshman book had paid for it.
Just outside the door, he paused to direct my attention to a rather Baroque conical structure, fabricated from metal and plastic in the Caltech shops. It was a celebratory gift: a Christmas tree, assembled from stylized models of the alpha-helix. We entered, and Linus introduced me to his wife Ava Helen, his daughter Linda, and their dog. Then he immediately excused himself. Ava Helen was quite friendly and frank; in particular, she told me that she thought Bright Wilson "never did come to think we Westerners were quite civilized." (That I found curious, as Bright was from Tennessee.) After 45 minutes, Linus reappeared, bright and cheerful; nobody asked what he'd been up to, and we soon departed. In later years, I met Pauling again perhaps a dozen times, in a variety of situations. Particularly, I remember a fascinating lecture he gave at the Harvard Medical School, about his seminal work on sickle-cell anemia and kindred studies; and a passionate speech, delivered from the pulpit of a Cambridge church, against the Vietnam war. Nearly as impressive were a pair of talks Linus gave at a 1991 symposium at Caltech which celebrated his 90th birthday. The first, after dinner, traced his career from high school, to the Oregon Agricultural College, and beyond in terms of his fascination with the chemical bond. The second,
the next morning, focused on x-ray crystallography. After a brisk historical survey, Pauling presented his own analysis of recent work on quasicrystals. He was convinced that the evidence had been misinterpreted, and thought the effects seen could be attributed to twinning. Although subsequent work has shown he was wrong about quasicrystals, the vigor and lucidity of his talk were extraordinary, rarely attained by scientists fifty years younger. The symposium presented six other talks: Max Perutz, Francis Crick, and Alexander Rich emphasized the great impact of Pauling's ideas on the development of molecular biology; George Porter, John Polanyi, and I described aspects of chemical reaction dynamics, for which understanding of molecular structure and bonding have been vital prerequisites. All the symposium papers, including both by Pauling and an additional chapter on femtochemistry by Ahmed Zewail and Richard Bernstein, have been published in a book edited by Ahmed Zewail, who organized the birthday celebration. It is titled The Chemical Bond, Structure and Dynamics (Academic Press, 1992). This is particularly to be recommended to young chemists beginning research. Every chapter exemplifies how major advances in science result from enlisting technical resources in the service of an architectural vision. Finally, I want to relate a story told by Pauling on the last occasion at which I saw him, in 1993. It was at a dinner the evening before I was to give a lecture in his honor at Caltech, almost exactly 35 years since my first visit. Pauling told the story apropos the importance of teaching. He said that one day, just fifteen minutes before his class, the phone rang in his office. It was Ava Helen, very distressed. The woods not far away were on fire and the wind was blowing towards their house. Pauling said he replied, "Sorry, darling--I can't help; I have to go teach my class." 
He did send a postdoctoral fellow to spray the house with a garden hose, and fortunately the wind shifted so only Ava Helen's nerves were singed. With the look and tone of puzzled innocence, Linus concluded by saying, "But, you know, my wife wouldn't speak to me for weeks." His daughter Linda, also at the dinner, confirmed the story. Incongruous as it is, this story does convey the high priority Pauling gave to teaching. For him, research and teaching were kindred efforts, to be pursued with the same zest and devotion. Despite the splendid advances in computing power and experimental resolution now available or in prospect, chemistry remains a fabulously complex, multilayered science. As much as ever, it needs heuristic notions and simplistic models to guide thinking and facilitate communication. That is why Pauling's chemical odyssey is of much more than historical interest. He epitomizes the quintessence of chemical theory. In my view, graduate students, especially in physical chemistry or chemical physics, are often misled by typical theoretical courses and research projects. They are likely to think that a blackboard or journal article full of equations is theory. However, usually it is just a treatment that does not invoke ideas outside an already accepted conceptual framework. A genuine theory opens up new perspectives, often by postulating or guessing a short-cut.
Seeking out such possibilities calls for an enterprising, adventurous attitude. A good dose of Pauling can inculcate the taste for that! There is another way in which typical science courses handicap students. They instill the fear of being wrong, because so many academic exercises have only one right answer, to be found by some canonical procedure. In contrast, an esteemed colleague, the late Don Bunker, liked to say: "The task of a chemical theorist is to be wrong--in an interesting way." All genuine theories are wrong, at least in the sense of being incomplete, and often a theory can only be provisional. To be interesting, however, it has to provide new growing points for its subject. Again and again, Pauling did that. He did not fear being wrong, indeed liked to be seen as a maverick. He came to like it too much. In his Caltech days, Linus benefited greatly from sifting his profusion of ideas in discussions with colleagues such as Verner Schomaker, for whose expertise and judgment he had high regard. Later, without that help, Pauling often did not sufficiently credit contrary evidence, and acted as if he could not be wrong. He even presumed that scientists unwilling to accept his megavitamin recommendations without adequate testing had unworthy motivations. In this too, his career offers compelling lessons. It is apt to close by quoting remarks Pauling made to the university students of Sweden at the Nobel Prize banquet in December, 1954. As spokesman for all the Nobel Laureates, he spoke from a podium overlooking more than a thousand guests in the grand Town Hall of Stockholm, just after a torchlight procession of students had marched in, carrying myriad ceremonial banners. Pauling first noted the "world-wide brotherhood of youth and science." Then he said (italics in his original text): When an old and distinguished person speaks to you, listen to him carefully and with respect--but do not believe him. 
Never put your trust in anything but your own intellect. Your elder, ... no matter whether he is a Nobel Laureate, may be wrong. The world progresses, year by year, century by century, as the members of the younger generation find out what was wrong among the things that their elders said, so you must always be skeptical--always think for yourself.
INDEX
ab initio VB calculations 423
bicyclic lactams 324
additive fuzzy density fragmentation 585, 603, 606
binary systems 114
bond dissociation energy 190, 208, 209, 303
adiabatic approximation 5
Born-Oppenheimer approximation 5, 21, 105, 366, 383, 473, 655
adjustable density matrix assembler method 585
boron 691, 714
alloys 691, 692, 693, 716
branching diagram 373, 389
alpha-helix 561, 629, 631, 633, 739
bridgehead lactams 327, 334
alpha-lactams 321
broken symmetry 8
amide linkage 321, 323, 328, 637
Brown's postulate 94
amino acids 629
Buckminsterfullerene 53
antiaromatic systems 493
carbocations 231, 232, 254
aromatic molecules 178, 183
carbonyl complexes 546
aromatic stabilization 513
chemical potential 197
aromatic systems 493
clamped nuclei approximation 5, 21, 159
aromaticity 509
classical VB theory 371, 380
avoided crossing model 428
Clebsch-Gordan expansion 42
Bader's topological atom 545
Clifford algebra 471
band theory 694
cluster methods 411
basis set superposition error 658
complete active space calculations 534, 536
BCS theory 723
configuration interaction 369, 396, 398, 405, 409
bent-bonds 61
benzene 396, 493, 497
contact ion pairs 557
benzenoid aromatics 494
correlated gaussians 27, 29
beta-lactams 321
Coulson-Fischer AOs 367, 369, 476
beta-pleated sheets 561, 629, 631, 633, 739
counterions 260
coupled cluster technique 412
effective charge 194
covalent-ionic resonance 465
electric dipole moment 160
covalent radii 192, 661
electric field 7
crystal packing 655
electric properties 159
cycloalkanes 310
electrochemical potential 190
cyclobutadiene 493, 501
electron affinity 192, 194, 484
cyclooctatetraene 493, 504
electron hopping 393
density functional theory 239, 604
electron population analysis 371
detonation pressure 347
electron spin pairing 367
detonation velocity 347
electronegativity 189, 201, 716
deuterium 37
electronegativity equalization 196
Dewar-Chatt-Duncanson model 547
electroneutrality principle 697, 706, 727
Dewar structures 375
electrophilic centers 199
diamagnetic susceptibility 160
electrophilic reactivity 85, 494
diatomic dications 423
electrophilic regioselectivity 88
dipole polarizability 131
electrostatic interactions 564, 656, 669, 673, 675
1,3-dipoles 533
Dirac-Fourier transform 214
Dirac-van Vleck vector model 379
dispersion energy 655, 663, 673, 676, 680
dissociation energy 190
DNA structure 561, 627, 739
donor-acceptor bond 196
electrostatic potential 351
enantiomers 7
energy hypersurface 105
enthalpies of formation 303
enzyme action 584, 655
ESCA chemical shifts 204, 326
donor-acceptor complexes 246, 454
exchange interactions 655, 660, 662, 669, 675, 678
donor-acceptor interactions 545
excited states 483
d-orbital participation 528
explicitly correlated geminals 27
dynamic dipole polarizability 141
explosives 347
extended π-systems 75
HOMO-LUMO interactions 568
external fields 7, 148
homodesmic reactions 88
F+ colour centers 447
host-guest interactions 583
five-electron three center bond 456
Hubbard Hamiltonian 474
folding funnels 639
Hückel Hamiltonian 474
force constants 149, 204
Hückel 4n+2 rule 494
force field method 635
Hund's rule 501
four-electron three center bond 452
Hund's paradox 7
Friauf polyhedron 719
hybrid orbitals 213, 303, 532
Fukui function 199
hydrogen anion 37
functional groups 321, 586
hydrogen bond 561
gauge-invariant AOs 168, 236, 239
hydrogen bond strength 562
gauge transformation 164
hydrogenic orbitals 213
Gegenbauer polynomials 214
hydrophilic effect 631
generalized VB theory 391
hydrophobic effect 629
generator coordinate method 5
hypercoordinate bonding 527
globular proteins 603, 629
hyperelectronic elements 691, 701, 706, 717, 725
Green functions 411
group electronegativity 204
hyperpolarizability 129, 134
hardness 196
hypoelectronic elements 691, 701, 706, 717, 725
hardness parameters 199, 201
hypoligated transition metal complexes 446
heat of formation 347
impact sensitivity 354
Heisenberg spin Hamiltonian 404, 409
Hellmann-Feynman theorem 156, 369
increased-valence structures 439, 452, 459, 463, 470
high spin states 480
induction energy 655, 665, 674, 676, 685
high-temperature superconductivity 404, 409
inhomogeneous electric field 13
intermetallic compounds 691, 693, 716
intermolecular interactions 8, 657
metastable systems 105
ionization potential 192, 484
Metropolis algorithm 111
isodesmic reactions 339
Mills-Nixon effect 47
Kekulé structures 75, 375, 407, 412, 481, 514
mixed parity states 6, 8
Kitaura-Morokuma decomposition 568
Kohn-Sham theory 199
lactams 321, 325
Lagrange's multipliers 155
Laguerre polynomials 214
Langevin term 160
molecular clusters 655
molecular dynamics simulations 646
molecular force constants 149
molecular geometry 2, 13
molecular graph 16
molecular magnetizability 147, 170
molecular mechanics 328
Laplacian of the charge density 661, 663, 668, 671
molecular polarizability 147
lattice model 639
molecular properties 149, 204
Levinthal paradox 630, 636, 651
molecular shape 65, 655, 660
lid algorithm 108
molecular similarity 584
London orbitals 162, 168
molecular structure 1, 15, 21
macromolecules 587, 603, 606
Møller-Plesset perturbation theory 59, 409, 546,
Madelung constant 103
main group elements 527
magnetic nuclear shielding 149
magnetic properties 519
many-body perturbation theory 410, 658
many-body VB theory 403, 409
Markovnikov's rule 190
Meissner effect 692
metallic bond 691
metallic orbital 695, 706, 725, 728
momentum distribution 214
momentum space 213
Monte-Carlo calculations 413, 646, 677
Mulliken's electronegativity 193
Mulliken magic formula 475
multiconfiguration VB theory 391
multipole expansion 663, 675
Morse function 569
natural bond orbitals 545
natural gauge origin 165
perfect pairing 371, 375, 404
new energetic materials 347
permutation symmetry 14
nitroaromatics 352
pi-bond current 178
NMR chemical shifts 204, 232, 240, 242, 259, 289, 630
PISA model 241, 245, 284
NO oxidation 470
non-adiabatic effect 37
non-adiabatic method 25
non-adiabatic wavefunction 42
nonalternant systems 481
nonaromatic systems 493
[N]-phenylenes 75
nuclear spin-spin coupling 149
one electron bond 440
optical activity 7
orbital electronegativity 196, 199, 201
point symmetry 14
polarizability 129, 134
potential energy surface 655, 657
PPP-VB method 471, 476
proteins 603, 614, 627, 645
proton affinity 204, 327
pseudopotentials 546
reduced conformational space 639
rehybridization 53, 75
relativistic effects 4
renormalization group 409
orbital hardness 196
resonance 190, 203, 326, 371, 397, 403, 405, 482, 493, 509, 656, 697, 702
orbital radii 207
resonating VB theory 404
overlap enhanced orbitals 398, 399, 471, 477
reversed Mills-Nixon effect 79
paramagnetic susceptibility 160
Pariser-Parr-Pople Hamiltonian 471, 473
Pascal's rule 172
Pauli principle 365, 385, 471, 657, 665, 674
Pauling's bond valence rule 103, 124
Pauling's electronegativity 190
Pauling unit 199
peptide linkage 322
Rumer diagrams 377
second quantization 475
semiconductors 441, 448
semiempirical calculations 337
sigma-ring current 172, 177
silaguanidinium cation 248
silylium ions 231, 232, 287
shape similarity measure 597
shock sensitivity 347, 348
tetrahedral atom 303
singlet diradical structures 453
tetrahedrane 312
Slater-Pauling curve 695, 707, 729
three-electron bond 439
Slater's rule 194
tight-binding Hamiltonian 474
Slater-type AO's 213
transition metal compounds 545
SN2 reactions 455
trialkylsilylium ion 277
solvation 246
tritium 39
solvent interactions 233, 243
umbrella effect 557
space-inversion symmetry 3
unitary group approach 471
spectroscopic states 3, 6, 11
unsynchronized resonance 691, 697, 700, 722
spin-coupled VB theory 391, 493, 495, 527, 532
valence bond method 365, 403, 405, 472, 494, 509, 511, 545, 693
spin-free Hamiltonian 375
valence state 192
spin polarization 695
Van der Waals bond 655
spin waves 405, 411
static dielectric susceptibility 11
superconductivity 404, 449, 691, 722, 723
supermolecule 588, 659
surface electrostatic potentials 351
symmetry adapted perturbation theory 657
symmetry breaking phenomena 8
symmetry operations 7
synchronized resonance 697, 700
Tamm-Dancoff approximation 475
tetracoordinated carbon 303
tensor of inertia 5
ternary ionic systems 119
Van der Waals complex 657
Van der Waals radii 656
VB coupled cluster method 483
vibrational spectra 575
weak interactions 655
Wheland σ-complex 85, 90, 278, 501
Wiberg bond index 557
Wigner number 374
Wigner operator 374
Wolfsberg-Helmholz formula 428
Woodward-Hoffmann rules 493
Zener theory 698
zwitterions 501